Abstract

A new class of statistical deformable models is introduced to study high-dimensional curves or 
images. In addition to the standard measurement error term, these deformable models include an 
extra error term modeling the individual variations in intensity around a mean pattern. It is shown 
that an appropriate tool for statistical inference in such models is the notion of sample Fréchet means,
which leads to estimators of the deformation parameters and the mean pattern. The main contribution 
of this paper is to study how the behavior of these estimators depends on the number n of design points 
and the number J of observed curves (or images). Numerical experiments are given to illustrate the 
finite sample performances of the procedure. 
1 Introduction



1.1 A statistical deformable model for curve and image analysis 

In many applications, one observes a set of curves or grayscale images which are high-dimensional data.
In such settings, it is reasonable to assume that the data at hand $Y_j^\ell$, denoting the $\ell$-th observation for
the $j$-th curve (or image), satisfy the following regression model:

$$Y_j^\ell = f_j(t_\ell) + \sigma \varepsilon_j^\ell, \quad j = 1, \ldots, J, \ \text{and} \ \ell = 1, \ldots, n, \tag{1.1}$$

where $f_j : \Omega \to \mathbb{R}$ are unknown regression functions (possibly random) with $\Omega$ a convex subset of $\mathbb{R}^d$, the
$t_\ell$'s are non-random points in $\Omega$ (deterministic design), the error terms $\varepsilon_j^\ell$ are i.i.d. normal variables with
zero mean and variance 1, and $\sigma > 0$. In this paper, we will suppose that the $f_j$'s are random elements
which vary around the same mean pattern. Our goal is to estimate such a mean pattern and to study the
consistency of the proposed estimators in various asymptotic settings: either when both the number $n$ of
design points and the number $J$ of curves (or images) tend to infinity, or when $n$ (resp. $J$) remains fixed
while $J$ (resp. $n$) tends to infinity.

In many situations, data sets of curves or images exhibit a source of geometric variations in time
or shape. In such settings, the usual Euclidean mean $\bar{Y}^\ell = \frac{1}{J} \sum_{j=1}^{J} Y_j^\ell$ in model (1.1) cannot be used
to recover a meaningful mean pattern. Indeed, consider the following simple model of randomly shifted
curves (with $d = 1$) which is commonly used in many applied areas such as neuroscience [TIR10] or
biology [Røn01],

$$f_j(t_\ell) = f(t_\ell - \theta_j^*), \quad j = 1, \ldots, J, \ \text{and} \ \ell = 1, \ldots, n, \tag{1.2}$$
where $f : \Omega \to \mathbb{R}$ is the mean pattern of the observed curves, and the $\theta_j^*$'s are i.i.d. random variables in
$\mathbb{R}$ with density $g$ and independent of the $\varepsilon_j^\ell$'s. In model (1.2), the shifts $\theta_j^*$ represent a source of variability
in time. However, in (1.2) the Euclidean mean is not a consistent estimator of the mean pattern $f$ since
by the law of large numbers

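$$\lim_{J \to \infty} \bar{Y}^\ell = \lim_{J \to \infty} \frac{1}{J} \sum_{j=1}^{J} f(t_\ell - \theta_j^*) = \int f(t_\ell - \theta) g(\theta) \, d\theta \quad \text{a.s.}$$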
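The inconsistency of the Euclidean mean can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: the mean pattern is taken to be $\cos(2\pi t)$, the shifts are uniform on $[-\rho, \rho]$, and the measurement noise is omitted since it averages out. For this choice the limit integral has the closed form $\cos(2\pi t)\sin(2\pi\rho)/(2\pi\rho)$, so the averaged curve is an attenuated version of the pattern.

```python
import math
import random


def euclidean_mean_of_shifted_curves(f, t, num_curves, rho, rng):
    """Average f(t - theta_j) over i.i.d. shifts theta_j ~ Uniform[-rho, rho]."""
    total = 0.0
    for _ in range(num_curves):
        theta = rng.uniform(-rho, rho)
        total += f(t - theta)
    return total / num_curves


def mean_pattern(t):
    # Hypothetical mean pattern, chosen only for this illustration.
    return math.cos(2.0 * math.pi * t)


rng = random.Random(0)
rho = 0.25   # half-width of the (assumed) uniform shift density
t0 = 0.0     # design point at which the curves are averaged

empirical = euclidean_mean_of_shifted_curves(mean_pattern, t0, 100_000, rho, rng)
# Closed form of the limit: (1 / (2 rho)) * integral of cos(2 pi (t0 - theta)) over [-rho, rho].
limit = math.cos(2.0 * math.pi * t0) * math.sin(2.0 * math.pi * rho) / (2.0 * math.pi * rho)

print(f"pattern value f(t0) = {mean_pattern(t0):.4f}")
print(f"Euclidean mean      = {empirical:.4f}")
print(f"limit integral      = {limit:.4f}")
```

The empirical average settles near the convolution of the pattern with the shift density, not near the pattern itself.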

The randomly shifted curves model (1.2) is close to the perturbation model introduced by [Goo91]
in shape analysis for the study of consistent estimation of a mean pattern from a set of random planar
shapes. The mean pattern to estimate in [Goo91] is called a population mean, but to stress the fact that
it comes from a perturbation model [Huc10] uses the term perturbation mean. To achieve consistency in
such models, a Procrustean procedure is used in [Goo91], which leads to the statistical analysis of sample
Fréchet means [Fre48] which are extensions of the usual Euclidean mean to non-linear spaces using
non-Euclidean metrics. For random variables belonging to a nonlinear manifold, a well-known example is the
computation of the mean of a set of planar shapes in Kendall's shape space [Ken84] which leads to the
Procrustean means studied in [Goo91]. Consistent estimation of a mean planar shape has been studied
by various authors, see e.g. [Goo91, KM97, KBCL99, Le98, LK00]. A detailed study of some properties
of the Fréchet mean in finite dimensional Riemannian manifolds (such as consistency and uniqueness) has
been performed in [Zie77, OC95, BP03, BP05, Huc10, Huc11, Afs11].

The main goal of this paper is to introduce statistical deformable models for curve and image analysis
that are analogous to Goodall's perturbation models [Goo91], and to build consistent estimators of a mean
pattern in such models. Our approach is inspired by Grenander's pattern theory which considers that
the curves or images $f_j$ in model (1.1) are obtained through the deformation of a mean pattern by a Lie
group action [Gre93, GM07]. In the last decade, there has been a growing interest in transformation Lie
groups to model the geometric variability of images, and the study of the properties of such deformation
groups is now an active field of research (see e.g. [MY01, TY05] and references therein). There is also
currently a growing interest in statistics on the use of Lie group actions to analyze geometric modes of
variability of a data set [HHM10a, HHM10b].

To describe more formally geometric variability, denote by $L^2(\Omega)$ the set of square integrable
real-valued functions on $\Omega$, and by $\mathcal{P}$ an open subset of $\mathbb{R}^p$. To the set $\mathcal{P}$, we associate a parametric family
of operators $(T_\theta)_{\theta \in \mathcal{P}}$ such that for each $\theta \in \mathcal{P}$ the operator $T_\theta : L^2(\Omega) \to L^2(\Omega)$ represents a geometric
deformation (parametrized by $\theta$) of a curve or an image. Examples of such deformation operators include
the cases of:

- Shifted curves: $T_\theta f(t) := f(t - \theta)$, with $\Omega = [0, 1]$, $f \in L^2_{per}([0, 1])$ (the space of periodic functions in
$L^2([0, 1])$ with period 1) and $\mathcal{P}$ an open set of $\mathbb{R}$.

- Rigid deformation of two-dimensional images:

$$T_\theta f(t) := f\left(e^{a} R_\alpha t - b\right), \quad \text{for } \theta = (a, \alpha, b) \in \mathcal{P},$$
with $\Omega = \mathbb{R}^2$, $\mathcal{P} \subset \mathbb{R} \times \mathbb{R} \times \mathbb{R}^2$, where $R_\alpha = \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{pmatrix}$ is a rotation matrix in $\mathbb{R}^2$, $e^{a}$ is
an isotropic scaling and $b$ a translation in $\mathbb{R}^2$.

- Deformation by a Lie group action: the two above cases are examples of a Lie group action on the
space $L^2(\Omega)$ (see [Hel01] for an introduction to Lie groups). More generally, assume that $G$ is a
connected Lie group of dimension $p$ acting on $\Omega$, meaning that for any $(g, t) \in G \times \Omega$ the action $\cdot$ of
$G$ onto $\Omega$ is such that $g \cdot t \in \Omega$. In general, $G$ is not a linear space but can be locally parametrized
by its Lie algebra $\mathcal{G} \simeq \mathbb{R}^p$ using the exponential map $\exp : \mathcal{G} \to G$. If $\mathcal{P} \subset \mathbb{R}^p$, this leads, for
$(\theta, f) \in \mathcal{P} \times L^2(\Omega)$, to define the deformation operators

$$T_\theta f(t) := f\left(\exp(\theta) \cdot t\right).$$

- Non-rigid deformation of curves or images: assume that one can construct a family $(\psi_\theta)_{\theta \in \mathcal{P}}$ of
parametric diffeomorphisms of $\Omega$ (see e.g. [BGL09]). Then, for $(\theta, f) \in \mathcal{P} \times L^2(\Omega)$, define the deformation
operators

$$T_\theta f(t) := f(\psi_\theta(t)).$$
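The first two examples can be written as small higher-order functions. This is a minimal sketch, not code from the paper: the function names are hypothetical, curves are represented as Python callables on $[0,1]$ extended periodically, and images as callables on pairs $(x, y)$.

```python
import math


def shift_operator(theta):
    """Return T_theta acting on 1-periodic functions: (T_theta f)(t) = f(t - theta)."""
    def apply(f):
        return lambda t: f((t - theta) % 1.0)
    return apply


def inverse_shift_operator(theta):
    """For shifts, the inverse of T_theta is simply T_{-theta}."""
    return shift_operator(-theta)


def rigid_operator(a, alpha, b):
    """Return T_theta for images: (T_theta f)(t) = f(exp(a) * R_alpha t - b), t in R^2."""
    scale, c, s = math.exp(a), math.cos(alpha), math.sin(alpha)

    def apply(f):
        def deformed(t):
            x = scale * (c * t[0] - s * t[1]) - b[0]
            y = scale * (s * t[0] + c * t[1]) - b[1]
            return f((x, y))
        return deformed
    return apply


# Usage: deforming a curve and undoing the deformation recovers the original curve.
curve = lambda t: math.sin(2.0 * math.pi * t) + 0.3 * math.cos(4.0 * math.pi * t)
recovered = inverse_shift_operator(0.37)(shift_operator(0.37)(curve))
print(abs(recovered(0.1) - curve(0.1)))
```

Representing operators as functions that return functions keeps the composition of a deformation with its inverse explicit.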

Then, in model (1.1), we assume that the $f_j$'s have a certain homogeneity in structure in the sense that
there exists some $f \in L^2(\Omega)$ such that

$$f_j(t) = T_{\theta_j^*} [f + Z_j](t), \quad \text{for all } t \in \Omega, \ \text{and} \ j = 1, \ldots, J, \tag{1.3}$$

where $\theta_j^* \in \mathcal{P}$, $j = 1, \ldots, J$ are i.i.d. random variables (independent of the $\varepsilon_j^\ell$'s) with an unknown density
$g$ with compact support included in $\mathcal{P}$ satisfying:

Assumption 1.1. The density $g$ of the $\theta_j^*$'s is continuously differentiable on $\mathcal{P}$ and has a compact support
$\Theta$ included in $\mathcal{P} \subset \mathbb{R}^p$. We assume that $\Theta$ can be written

$$\Theta = \left\{ \theta = (\theta^1, \ldots, \theta^p) \in \mathbb{R}^p, \ |\theta^{p_1}| \le \rho, \ 1 \le p_1 \le p \right\} \tag{1.4}$$

where $\rho > 0$.

The function $f$ in model (1.3) represents the unknown mean pattern of the $f_j$'s. The $Z_j$'s are supposed
to be independent of the $\varepsilon_j^\ell$'s and are i.i.d. realizations of a second order centered Gaussian process $Z$
taking its values in $L^2(\Omega)$. The $Z_j$'s represent the individual variations in intensity around $f$, while the
random operators $T_{\theta_j^*}$ model geometric deformations in time or space. Then, if we assume that the $T_\theta$'s
are linear operators, equation (1.3) leads to the following statistical deformable model for curve or image
analysis

$$Y_j^\ell = T_{\theta_j^*} f(t_\ell) + T_{\theta_j^*} Z_j(t_\ell) + \sigma \varepsilon_j^\ell, \quad j = 1, \ldots, J, \ \text{and} \ \ell = 1, \ldots, n, \tag{1.5}$$

where the $\varepsilon_j^\ell$ are i.i.d. normal variables with zero mean and variance 1.

Model (1.5) could also be called a perturbation model using the terminology in [Goo91, Huc10] for
shape analysis. To be more precise, let $Y \in \mathbb{R}^{n \times 2}$ be a set of $n$ points in $\mathbb{R}^2$ representing a planar shape.
Define a deformation operator $T_\theta$ for $\theta = (a, \alpha, b) \in \Theta = \mathbb{R} \times [0, 2\pi] \times \mathbb{R}^2$ acting on $\mathbb{R}^{n \times 2}$ in the following
way

$$T_\theta Y = e^{a} Y R_\alpha + 1_n b', \quad \text{where } R_\alpha = \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{pmatrix},$$

and $1_n = (1, \ldots, 1)' \in \mathbb{R}^n$. Consistent estimation of a mean shape was first studied in [Goo91] when
a set of random shapes $Y_1, \ldots, Y_J$ is drawn from the following perturbation model

$$Y_j = T_{\theta_j^*}(\mu + \zeta_j), \quad j = 1, \ldots, J. \tag{1.6}$$
Model (1.6) is similar to the statistical deformable model (1.5), where $\mu \in \mathbb{R}^{n \times 2}$ is the unknown
perturbation mean to estimate, and the $\zeta_j$ are i.i.d. random vectors in $\mathbb{R}^{n \times 2}$ with zero mean. Nevertheless,
there exist major differences between our approach and the one in [Goo91]. First, in model (1.5), the
deformation parameters $\theta_j^*$ are assumed to be random variables following an unknown distribution,
whereas they are just nuisance parameters in model (1.6) for shape analysis, see [Goo91, KM97]. In
some applications (e.g. in biomedical imaging [JDJG04]), it is of interest to reconstruct the unobserved
parameters $\theta_j^*$ and to estimate their distribution. One of the main contributions of this paper is then to
construct upper and lower bounds for the estimation of such deformation parameters. Moreover, in model
(1.5), there are two additive error terms, whereas the model (1.6) only includes the error term $\zeta_j$. In model
(1.5), the $\varepsilon_j^\ell$ is an additive noise modeling the errors in the measurements, while the $Z_j$'s model (possibly
smooth) variations in intensity of the individuals around the mean pattern $f$.

In [KM97], the authors studied the relationship between isotropicity of the additive noise $\zeta_j$ and the
convergence of Procrustean procedures to the perturbation mean $\mu$ as $J \to +\infty$. It is shown in [KM97]
that, for isotropic errors, Procrustean means are consistent, but that, for non-isotropic errors, they may
not converge to $\mu$. For a recent discussion on the issues of consistency of sample Procrustes means in
perturbation models and extension to non-metrical Fréchet means, we refer to [Huc10] and [Huc11]. In this
paper, we carefully analyze the role of the dimension $n$ and the number of samples $J$ on the consistency
of Procrustean means in model (1.5). To obtain consistent procedures, we show that it is not required to
impose very restrictive conditions on the error terms $Z_j$ such as isotropicity for the $\zeta_j$ in (1.6) for shape
analysis. Here, the key quantity is the dimension $n$ of the data (number of design points) which plays the
central role to guarantee the convergence of our estimators. This point is another major difference with the
approach of statistical shape analysis [Goo91] that does not take into account the dimensionality of the
shape space to analyze the consistency of Procrustean estimators.

Note that a subclass of the deformable model (1.5) is the so-called shape invariant model (SIM)

$$Y_j^\ell = T_{\theta_j^*} f(t_\ell) + \sigma \varepsilon_j^\ell, \quad j = 1, \ldots, J, \ \text{and} \ \ell = 1, \ldots, n, \tag{1.7}$$

i.e. without incorporating in (1.5) the additive terms $Z_j$.

The goal of this paper is twofold. First, we propose a general methodology for estimating $f$ and the
$\theta_j^*$'s based on observations coming from model (1.5). For this purpose, we show that an appropriate tool is
the notion of sample Fréchet mean of a data set [Fre48, Zie77, BP03] that has been widely studied in shape
analysis [Goo91, KM97, Le98, LK00, Huc10] and more recently in biomedical imaging [JDJG04, Pen06].
Secondly, we study the consistency of the resulting estimators in various asymptotic settings: either when
$n$ and $J$ both tend to infinity, or when $n$ is fixed and $J \to +\infty$, or when $J$ is fixed and $n \to +\infty$.

1.2 Organization of the paper 

Section 2 contains a description of our estimating procedure and a review of previous work in mean pattern
estimation. In Section 3, we derive a lower bound for the quadratic risk of estimators of the deformation
parameters. In Section 4, we discuss some identifiability issues in model (1.5). In Section 5, we derive
consistency results for the Fréchet mean in the case (1.2) of randomly shifted curves. In Section 6 and
Section 7, we give general conditions to extend these results to the more general deformable model (1.5).
Section 8 contains some numerical experiments. A short conclusion with some perspectives is given in
Section 9. All proofs are postponed to a technical Appendix.
2 The estimating procedure



2.1 A dissimilarity measure based on deformation operators 

To define a notion of sample Fréchet mean for curves or images, let us suppose that the family of
deformation operators $(T_\theta)_{\theta \in \mathcal{P}}$ is invertible in the sense that there exists a family of operators $(\tilde{T}_\theta)_{\theta \in \mathcal{P}}$ such
that for any $(\theta, f) \in \mathcal{P} \times L^2(\Omega)$

$$\tilde{T}_\theta f \in L^2(\Omega) \quad \text{and} \quad \tilde{T}_\theta T_\theta f = f.$$
Then, for two functions $f, h \in L^2(\Omega)$ introduce the following dissimilarity measure

$$d_T^2(h, f) = \inf_{\theta \in \mathcal{P}} \int_\Omega \left( \tilde{T}_\theta h(t) - f(t) \right)^2 dt.$$

If $d_T(h, f) = 0$ then there exists $\theta \in \mathcal{P}$ such that $f = \tilde{T}_\theta h$, meaning that the functions $f$ and $h$ are
equal up to a geometric deformation. Note that $d_T$ is not necessarily a distance on $L^2(\Omega)$, but it can be
used to define a notion of sample Fréchet mean of data from model (1.5). For this purpose let $\mathcal{F}$ denote
a subspace of $L^2(\Omega)$ and suppose that the $\hat{f}_j$ are smooth functions in $\mathcal{F} \subset L^2(\Omega)$ obtained from the data
$Y_j^\ell$, $\ell = 1, \ldots, n$ for $j = 1, \ldots, J$, see Section 5.2 and Section 6.2 for precise definitions. Following the
definition of a Fréchet mean in general metric space [Fre48], define an estimator of the mean pattern $f$ as

$$\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{J} \sum_{j=1}^{J} d_T^2(\hat{f}_j, f). \tag{2.1}$$

Note that $\hat{f}$ falls into the category of non-metrical sample Fréchet means whose definitions and asymptotic
properties are discussed in [Huc10] for random variables belonging to Riemannian manifolds. However,
unlike the usual approach in shape analysis, the Fréchet mean (2.1) is based on smoothed data. In what
follows, we show that smoothing is a key preliminary step to obtain the convergence of $\hat{f}$ to the mean
pattern $f$ in the deformable model (1.5). It can be easily shown that the computation of $\hat{f}$ can be done
in two steps: first minimize the following criterion

$$(\hat\theta_1, \ldots, \hat\theta_J) = \arg\min_{(\theta_1, \ldots, \theta_J) \in \Theta^J} M(\theta_1, \ldots, \theta_J), \tag{2.2}$$

where

$$M(\theta_1, \ldots, \theta_J) = \frac{1}{J} \sum_{j=1}^{J} \int_\Omega \left( \tilde{T}_{\theta_j} \hat{f}_j(t) - \frac{1}{J} \sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \hat{f}_{j'}(t) \right)^2 dt, \tag{2.3}$$

which gives an estimation of the deformation parameters $\theta_1^*, \ldots, \theta_J^*$, and then in a second step take

$$\hat{f}(t) = \frac{1}{J} \sum_{j=1}^{J} \tilde{T}_{\hat\theta_j} \hat{f}_j(t), \quad \text{for } t \in \Omega, \tag{2.4}$$

as an estimator of the mean pattern $f$.

Note that this two-step procedure belongs to the category of Procrustean methods (see e.g. [DM98,
Goo91]). A similar approach to (2.2) has been developed by [JDJG04] in the context of biomedical images
using diffeomorphic deformation operators.
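For shifted curves sampled on a regular grid, the two-step procedure can be sketched as follows. This is an illustration under simplifying assumptions that are not in the text: shifts are restricted to multiples of $1/n$ so that the inverse operator is a circular index shift, the integral in (2.3) is replaced by a grid average, and the minimization in (2.2) is done by a basic coordinate descent that accepts strict improvements only. All function names are hypothetical.

```python
def roll(values, k):
    """Discrete inverse shift: result[i] = values[(i + k) % n], i.e. t -> f(t + k/n)."""
    n = len(values)
    return [values[(i + k) % n] for i in range(n)]


def criterion(curves, shifts):
    """Grid version of M(theta): mean squared distance of the aligned curves to their average."""
    aligned = [roll(c, k) for c, k in zip(curves, shifts)]
    J, n = len(curves), len(curves[0])
    mean = [sum(a[i] for a in aligned) / J for i in range(n)]
    return sum((a[i] - mean[i]) ** 2 for a in aligned for i in range(n)) / (J * n)


def estimate_shifts(curves, max_shift, sweeps=5, tol=1e-12):
    """Step 1: coordinate descent on the criterion over integer shifts in [-max_shift, max_shift]."""
    shifts = [0] * len(curves)
    best = criterion(curves, shifts)
    for _ in range(sweeps):
        for j in range(len(curves)):
            for k in range(-max_shift, max_shift + 1):
                trial = shifts[:j] + [k] + shifts[j + 1:]
                value = criterion(curves, trial)
                if value < best - tol:  # accept strict improvements only
                    shifts, best = trial, value
    return shifts


def frechet_mean(curves, shifts):
    """Step 2: average the aligned curves."""
    aligned = [roll(c, k) for c, k in zip(curves, shifts)]
    J = len(curves)
    return [sum(a[i] for a in aligned) / J for i in range(len(curves[0]))]


# Hypothetical noise-free example: a triangular bump observed under three grid shifts.
n = 16
pattern = [0.0] * n
pattern[0], pattern[1], pattern[-1] = 1.0, 0.5, 0.5
true_shifts = [2, 0, 5]
curves = [roll(pattern, -s) for s in true_shifts]  # curve_j[i] = pattern[(i - s_j) % n]

estimated = estimate_shifts(curves, max_shift=6)
mean_curve = frechet_mean(curves, estimated)
print(estimated, criterion(curves, estimated))
```

Without a constraint on the parameter set, the shifts are recovered only up to a common offset, so the comparison with the true shifts is made on differences (or after centering).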
2.2 Previous work in mean pattern estimation and geometric variability analysis



Estimating the mean pattern of a set of curves that differ by a time transformation is usually referred
to as the curve registration problem, see e.g. [GK92, Big06, RL01, WG97, LM04]. However, in these
papers, studying consistent estimators of the mean pattern $f$ as the number of curves $J$ and design points
$n$ tend to infinity is not considered. For the SIM (1.7), a semiparametric point of view has been proposed
in [GLM07] and [Vim10] to estimate non-random deformation parameters (such as shifts and amplitudes)
as the number $n$ of observations per curve grows, but with a fixed number $J$ of curves. A generalisation
of this semiparametric approach for two-dimensional images is proposed in [BGV09]. The case of image
deformations by a Lie group action is also investigated in [BLV10] from a semiparametric point of view
using a SIM.

In the simplest case of randomly shifted curves in a SIM, [BG10] have studied minimax estimation of
the mean pattern $f$ by letting only the number $J$ of curves go to infinity. Self-modelling regression
(SEMOR) methods proposed by [KG88] are semiparametric models where each observed curve is a parametric
transformation of the same regression function. However, the SEMOR approach does not incorporate
random fluctuations in intensity of the individuals around a mean pattern $f$ through an unknown process
$Z_j$ as in model (1.5). The authors in [KG88] studied the consistency of the SEMOR approach using a
Procrustean algorithm. Recently, there has also been a growing interest in the development of statistical
deformable models for image analysis and the construction of consistent estimators of a mean pattern,
see [GM01, BGV09, BGL09, AAT07, AKT09].



3 Lower bounds for the estimation of the deformation parameters 

In this section, we derive non-asymptotic lower bounds for the quadratic risk of an arbitrary estimator of
the deformation parameters under the following smoothness assumption of the mapping $(\theta, t) \mapsto T_\theta f(t)$.

Assumption 3.1. For all $\theta = (\theta^1, \ldots, \theta^p) \in \mathcal{P}$, $T_\theta : L^2(\Omega) \to L^2(\Omega)$ is a linear operator such that
the function $t \mapsto \partial_{\theta^{p_1}} T_\theta f(t)$ exists and belongs to $L^2(\Omega)$ for any $p_1 = 1, \ldots, p$. Moreover, there exists a
constant $C(\Theta, f) > 0$ such that

$$\left\| \partial_{\theta^{p_1}} T_\theta f \right\|_{L^2}^2 \le C(\Theta, f),$$

for all $p_1 = 1, \ldots, p$ and $\theta \in \Theta$.



3.1 Shape Invariant Model 



Theorem 3.1. Consider the SIM (1.7) and suppose that Assumption 3.1 holds. Assume that $g$ satisfies
Assumption 1.1 and that $\int_\Theta \| \partial_\theta \log(g(\theta)) \|^2 g(\theta) \, d\theta < +\infty$. Let $\hat\theta \in \mathcal{P}^J$ be any estimator (a measurable
function of the data) of $\theta^* = (\theta_1^*, \ldots, \theta_J^*)$. Then, for any $n \ge 1$ and $J \ge 1$,

$$\mathbb{E}\left[ \frac{1}{J} \| \hat\theta - \theta^* \|_{\mathbb{R}^{pJ}}^2 \right] \ge \frac{\sigma^2 n^{-1}}{C(\Theta, f) + \sigma^2 n^{-1} \int_\Theta \| \partial_\theta \log(g(\theta)) \|^2 g(\theta) \, d\theta}, \tag{3.1}$$
where $C(\Theta, f)$ is the constant defined in Assumption 3.1 and $\| \cdot \|_{\mathbb{R}^{pJ}}$ is the standard Euclidean norm in
$\mathbb{R}^{pJ}$.

The lower bound given in inequality (3.1) does not decrease as $J$ increases. Thus, if the number $n$ of
design points is fixed, increasing the number $J$ of curves (or images) does not improve the quality of the
estimation of the deformation parameters for any estimator $\hat\theta$. Nevertheless, this lower bound goes to
$0$ as the dimension $n \to +\infty$.
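The behaviour of the bound is easy to tabulate. In the sketch below the right-hand side of (3.1) is coded as a function of $n$ only; the numerical values chosen for $\sigma^2$, for the constant $C(\Theta, f)$ and for the integral $\int_\Theta \|\partial_\theta \log g\|^2 g$ are hypothetical placeholders, not values from the paper.

```python
def shift_risk_lower_bound(n, sigma2, c_theta_f, fisher_info):
    """Right-hand side of (3.1): (sigma^2 / n) / (C + (sigma^2 / n) * I_g).

    The number J of curves does not appear among the arguments.
    """
    noise = sigma2 / n
    return noise / (c_theta_f + noise * fisher_info)


# Hypothetical constants, chosen only to display the dependence on n.
SIGMA2, C_THETA_F, FISHER_INFO = 1.0, 20.0, 10.0

bounds = {n: shift_risk_lower_bound(n, SIGMA2, C_THETA_F, FISHER_INFO) for n in (10, 100, 1000)}
for n, value in bounds.items():
    print(f"n = {n:5d}   lower bound = {value:.6f}")
```

The printed values decrease roughly like $\sigma^2 / (n \, C(\Theta, f))$, while no choice of $J$ can change them.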
3.2 General model



The main difference between the general model (1.5) and the SIM (1.7) is the extra error terms $T_{\theta_j^*} Z_j$,
$j = 1, \ldots, J$. In what follows, $\mathbb{E}_\theta[\,\cdot\,]$ denotes expectation conditionally to $\theta \in \Theta^J$. Since the random
processes $Z_j$ are observed through the action of the random deformation operators $T_{\theta_j^*}$, it is necessary
to specify how the $T_{\theta_j^*}$'s modify the law of the process $Z_j$.

Assumption 3.2. There exists a positive semi-definite symmetric $n \times n$ matrix $\Sigma_n(\Theta)$ such that the
covariance matrix of $\mathbf{Z} = [Z(t_\ell)]_{\ell=1}^{n}$ satisfies $\mathbb{E}_\theta\left[ T_\theta \mathbf{Z} (T_\theta \mathbf{Z})' \right] = \Sigma_n(\Theta)$.

This assumption means that the law of the random process $Z$ is somewhat invariant by the deformation
operators $T_\theta$. Such a hypothesis is similar to the condition given in [KM97] to ensure consistency of
Fréchet mean estimators in Kendall's shape space using a model similar to (1.5) with $\sigma = 0$. After a
normalization step, the deformations considered in [KM97] are rotations of the plane, and the authors
in [KM97] study the case where the law of the error term $Z$ is isotropic, that is to say, invariant by the
action of rotations.

Theorem 3.2. Consider the general model (1.5). Suppose that Assumptions 3.1 and 3.2 hold. Assume
that the density $g$ satisfies Assumption 1.1 and that $\int_\Theta \| \partial_\theta \log(g(\theta)) \|^2 g(\theta) \, d\theta < +\infty$. Let $\hat\theta \in \mathcal{P}^J$ be any
estimator (a measurable function of the data) of $\theta^* = (\theta_1^*, \ldots, \theta_J^*)$. Then, for any $n \ge 1$ and $J \ge 1$, we
have

$$\mathbb{E}\left[ \frac{1}{J} \| \hat\theta - \theta^* \|_{\mathbb{R}^{pJ}}^2 \right] \ge \frac{(\sigma^2 + s_n^2(\Theta)) n^{-1}}{C(\Theta, f) + (\sigma^2 + s_n^2(\Theta)) n^{-1} \int_\Theta \| \partial_\theta \log(g(\theta)) \|^2 g(\theta) \, d\theta}, \tag{3.2}$$

where $C(\Theta, f)$ is the constant defined in Assumption 3.1 and $s_n^2(\Theta)$ denotes the smallest eigenvalue of
$\Sigma_n(\Theta)$.

Again, the lower bound (3.2) does not depend on $J$. Thus, increasing the number $J$ of observations
does not decrease the quadratic risk of any estimator of the deformation parameters. Moreover, the
lower bound (3.2) tends to zero as $n \to +\infty$ only if $\lim_{n \to +\infty} n^{-1} s_n^2(\Theta) = 0$.



3.3 Application to the shifted curves model 

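Consider the shifted curves model (1.2) with an equi-spaced design, namely

$$Y_j^\ell = f\left( \tfrac{\ell}{n} - \theta_j^* \right) + Z_j\left( \tfrac{\ell}{n} - \theta_j^* \right) + \sigma \varepsilon_j^\ell, \quad j = 1, \ldots, J, \ \text{and} \ \ell = 1, \ldots, n. \tag{3.3}$$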



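Theorem 3.3. Consider the model (3.3). Assume that $f$ is continuously differentiable on $[0, 1]$ and that
$Z$ is a centered stationary process with value in $L^2_{per}([0, 1])$. Suppose that $\Theta = [-\rho, \rho]$ with $\rho < \frac{1}{2}$ and
$\int_\Theta \left( \partial_\theta \log(g(\theta)) \right)^2 g(\theta) \, d\theta < +\infty$. Let $\hat\theta \in \mathbb{R}^J$ be any estimator of the true random shifts $\theta^* = (\theta_1^*, \ldots, \theta_J^*)$,
i.e. a measurable function of the data in model (3.3). Then, for any $n \ge 1$ and $J \ge 1$

$$\mathbb{E}\left[ \frac{1}{J} \| \hat\theta - \theta^* \|^2 \right] \ge \frac{n^{-1} \sigma^2}{\| \partial_t f \|_\infty^2 + n^{-1} \sigma^2 \int_\Theta \left( \partial_\theta \log(g(\theta)) \right)^2 g(\theta) \, d\theta}, \tag{3.4}$$

where $\| \partial_t f \|_\infty = \sup_{t \in [0,1]} \{ |\partial_t f(t)| \}$ with $\partial_t f$ denoting the first derivative of $f$.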
4 Identifiability conditions



4.1 The shifted curves model 

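Without any further assumptions, the randomly shifted curves model (3.3) is not identifiable. Indeed, if
$\theta_0 \in \Theta$ satisfies $\theta_j^* + \theta_0 \in \Theta$, $j = 1, \ldots, J$, then replacing $f(\cdot)$ by $f(\cdot - \theta_0)$ and $\theta_j^*$ by $\theta_j^* + \theta_0$ does not
change the formulation of model (3.3). Choosing identifiability conditions amounts to imposing constraints
on the minimization of the criterion

$$D(\theta) = \frac{1}{J} \sum_{j=1}^{J} \int_0^1 \left( f(t - \theta_j^* + \theta_j) - \frac{1}{J} \sum_{j'=1}^{J} f(t - \theta_{j'}^* + \theta_{j'}) \right)^2 dt, \tag{4.1}$$

for $\theta = (\theta_1, \ldots, \theta_J) \in \Theta^J$, which can be interpreted as a version without noise of the criterion (2.2) using
the ideal smoothers $\hat{f}_j(\cdot) = f(\cdot - \theta_j^*)$. Obviously, the criterion $D(\theta)$ has a minimum at $\theta^* = (\theta_1^*, \ldots, \theta_J^*)$
such that $D(\theta^*) = 0$, but this minimizer of $D$ on $\Theta^J$ is clearly not unique. If the true shifts are supposed
to have zero mean (i.e. $\int_\Theta \theta g(\theta) \, d\theta = 0$) it is natural to introduce the constrained set

$$\Theta_0 = \left\{ (\theta_1, \ldots, \theta_J) \in \Theta^J, \ \theta_1 + \ldots + \theta_J = 0 \right\}. \tag{4.2}$$

It is shown in [BG10], Lemma 6, that if $f \in L^2([0, 1])$ is such that $\int_0^1 f(t) e^{-i 2 \pi t} \, dt \neq 0$ and if $\rho < 1/4$ (recall
that $\Theta = [-\rho, \rho]$), then the criterion $D(\theta)$ has a unique minimum on $\Theta_0$ in the sense that $D(\theta) > D(\theta^*_{\Theta_0})$
for all $\theta \in \Theta_0$ with $\theta \neq \theta^*_{\Theta_0}$ where

$$\theta^*_{\Theta_0} = (\theta_1^* - \bar\theta^*, \ldots, \theta_J^* - \bar\theta^*) \quad \text{with} \quad \bar\theta^* = \frac{1}{J} \sum_{j=1}^{J} \theta_j^*. \tag{4.3}$$

Under such assumptions, we will compute estimators of the random shifts by minimizing the criterion
(2.2) over the constrained set $\Theta_0$ and not directly on $\Theta^J$. Consistency of such constrained estimators will
then be studied under the following identifiability conditions: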

Assumption 4.1. The mean pattern $f$ is such that $\int_0^1 f(t) e^{-i 2 \pi t} \, dt \neq 0$.

Assumption 4.2. The support of the density g is included in [— p',p'] for some < p' < ^ < 1/4 and is 
such that $\int_\Theta \theta g(\theta) \, d\theta = 0$.

Under such assumptions, $D(\theta)$ can be bounded from below by the quadratic function $\frac{1}{J} \| \theta - \theta^*_{\Theta_0} \|^2$
which will be an important property to derive consistent estimators.

Proposition 4.1. Suppose that Assumptions 4.1 and 4.2 hold with $\rho < 1/16$. Then, for any $\theta =
(\theta_1, \ldots, \theta_J) \in \Theta_0$, one has that

$$D(\theta) - D(\theta^*_{\Theta_0}) \ge C(f, \rho) \frac{1}{J} \| \theta - \theta^*_{\Theta_0} \|^2,$$

where $C(f, \rho) > 0$ is a constant depending only on $f$ and $\rho$.

Assumption 4.2 and the condition that $\rho < 1/16$ in Proposition 4.1 mean that the support of the
density $g$ of the shifts is sufficiently small, and that the shifted curves $f_j(t) = f(t - \theta_j^*)$ are in some sense
concentrated around the mean pattern $f$. Such an assumption of concentration of the data around the
same mean pattern has been used in various papers to prove the uniqueness and the consistency of Fréchet
means for random variables lying in a Riemannian manifold, see [Kar77, Le98, BP03, Afs11, Ken90].
4.2 The general case

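In the case of general deformation operators, define for $\theta = (\theta_1, \ldots, \theta_J) \in \Theta^J$ the criterion

$$D(\theta) = \frac{1}{J} \sum_{j=1}^{J} \int_\Omega \left( \tilde{T}_{\theta_j} T_{\theta_j^*} f(t) - \frac{1}{J} \sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} T_{\theta_{j'}^*} f(t) \right)^2 dt. \tag{4.4}$$

Obviously, using that for all $\theta \in \Theta$, $\tilde{T}_\theta T_\theta f = f$, the criterion $D(\theta)$ has a minimum at $\theta^* = (\theta_1^*, \ldots, \theta_J^*)$
such that $D(\theta^*) = 0$. However, without any further restrictions the minimizer of $D(\theta)$ is not necessarily
unique on $\Theta^J$.

Assumption 4.3. Let $\boldsymbol{\Theta} \subset \Theta^J$ be such that there exists a unique $\theta^*_{\boldsymbol{\Theta}} \in \boldsymbol{\Theta}$ satisfying $D(\theta^*_{\boldsymbol{\Theta}}) = 0$.

Then, $\boldsymbol{\Theta}$ is the set onto which we will carry the minimization of the criterion $M(\theta)$ (2.3). In the case of
shifted curves and under Assumptions 4.1 and 4.2, the only set onto which the criterion $D$ vanishes is the
line $\{ \theta^* + \theta_0 1_J, \ \theta_0 \in \mathbb{R} \} \subset \mathbb{R}^J$, where $1_J = (1, \ldots, 1)' \in \mathbb{R}^J$. An easy way to choose the set $\boldsymbol{\Theta}$ is to take
a linear subset of $\Theta^J$, see Figure 1 for an illustration. By considering the subset

$$\Theta_0 = \Theta^J \cap 1_J^{\perp} = \left\{ (\theta_1, \ldots, \theta_J) \in \Theta^J, \ \theta_1 + \ldots + \theta_J = 0 \right\},$$

where $1_J^{\perp}$ is the orthogonal of $1_J$ in $\mathbb{R}^J$, then Assumption 4.3 is satisfied with $\theta^*_{\Theta_0}$ given in (4.3). More
generally, if the deformation parameters $\theta_j^*$, $j = 1, \ldots, J$ are supposed to be random variables with zero
mean, then optimizing $D(\theta)$ on $\Theta_0$ is a natural choice. Another identifiability condition for shifted curves
is proposed in [GLM07] and [Vim10] by taking

$$\Theta_1 = \Theta^J \cap e_1^{\perp} = \left\{ (\theta_1, \ldots, \theta_J) \in \Theta^J, \ \theta_1 = 0 \right\}, \tag{4.5}$$

where $e_1 = (1, 0, \ldots, 0) \in \mathbb{R}^J$. In this case, $\theta^*_{\Theta_1} = (0, \theta_2^* - \theta_1^*, \ldots, \theta_J^* - \theta_1^*)$. Choosing to minimize $D(\theta)$ on
$\Theta_1$ amounts to choosing the first curve as a reference onto which all the other curves are aligned, meaning
that the first shift $\theta_1^*$ is not random, see Figure 1.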




Figure 1: Choice of identifiability conditions for shifted curves in the case J = 2. 
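The two identifiability conditions for shifted curves amount to picking one representative on the line of minimizers. The following sketch illustrates this numerically; the pattern, the shift values and the function names are hypothetical, and the noise-free criterion for shifted curves is approximated by a Riemann sum.

```python
import math


def project_theta0(shifts):
    """Representative with zero sum (constraint set Theta_0): subtract the average shift."""
    mean = sum(shifts) / len(shifts)
    return [s - mean for s in shifts]


def project_theta1(shifts):
    """Representative with first coordinate zero (constraint set Theta_1): first curve is the reference."""
    return [s - shifts[0] for s in shifts]


def noise_free_criterion(theta, theta_star, f, grid=200):
    """Riemann-sum approximation of D(theta) for a 1-periodic pattern f."""
    J = len(theta)
    total = 0.0
    for i in range(grid):
        t = i / grid
        values = [f(t - ts + th) for ts, th in zip(theta_star, theta)]
        mean = sum(values) / J
        total += sum((v - mean) ** 2 for v in values) / J
    return total / grid


# Hypothetical pattern and true shifts, used only for illustration.
f = lambda t: math.cos(2.0 * math.pi * t) + 0.5 * math.sin(4.0 * math.pi * t)
theta_star = [0.05, -0.02, 0.03]

for label, theta in [("theta*", theta_star),
                     ("theta* + 0.01", [t + 0.01 for t in theta_star]),
                     ("Theta_0 representative", project_theta0(theta_star)),
                     ("Theta_1 representative", project_theta1(theta_star)),
                     ("no alignment", [0.0, 0.0, 0.0])]:
    print(f"{label:24s} D = {noise_free_criterion(theta, theta_star, f):.3e}")
```

Every point of the line gives a criterion value that is zero up to rounding, whereas unaligned curves do not, which is why a constraint such as $\Theta_0$ or $\Theta_1$ is needed to single out one minimizer.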

Following the classical guidelines in M-estimation (see e.g. [vdV98]), a necessary condition to ensure
the convergence of M-estimators such as (2.2) is that the local minima of $D(\theta)$ over $\boldsymbol{\Theta}$ are well separated
from the global minimum of $D(\theta)$ at $\theta = \theta^*_{\boldsymbol{\Theta}}$ (satisfying $D(\theta^*_{\boldsymbol{\Theta}}) = 0$). The following assumption can be
interpreted in this sense.
Assumption 4.4. For all $\theta \in \boldsymbol{\Theta}$ we have

$$D(\theta) - D(\theta^*_{\boldsymbol{\Theta}}) \ge C(\boldsymbol{\Theta}, \mathcal{F}) \frac{1}{J} \| \theta - \theta^*_{\boldsymbol{\Theta}} \|^2 \tag{4.6}$$
for a constant $C(\boldsymbol{\Theta}, \mathcal{F}) > 0$ independent of $J$.

In the shifted curve model, Assumption 4.4 is verified if Assumptions 4.1 and 4.2 hold (see Proposition
4.1).



5 Consistent estimation in the shifted curves model 

In this section, we give conditions to ensure consistency of the estimators defined in Section 2 in the
shifted curves model (3.3) with an equi-spaced design.

5.1 The random perturbations Zj 

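Following the assumptions of Theorem 3.3, $Z$ will be supposed to be a stationary process with
covariance function $R : [0, 1] \to \mathbb{R}$. The law of $Z$ is thus invariant by the action of a shift. Conditionally to
$\theta_j^* \in \Theta$, the covariance of the vector $T_{\theta_j^*} Z_j = \left[ Z_j\left( \tfrac{\ell}{n} - \theta_j^* \right) \right]_{\ell=1}^{n}$ is a Toeplitz matrix equal to

$$\Sigma_n = \left[ R\left( \tfrac{|\ell - \ell'|}{n} \right) \right]_{\ell, \ell' = 1}^{n}. \tag{5.1}$$

Let $\gamma_{\max}(\Sigma_n)$ be the largest eigenvalue of the matrix $\Sigma_n$. It follows from standard results on Toeplitz
matrices (see e.g. [HJ90]) that

$$\lim_{n \to +\infty} \frac{1}{n} \gamma_{\max}(\Sigma_n) \le \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left| R\left( \tfrac{k}{n} \right) \right| = \gamma, \tag{5.2}$$

where $\gamma = \int_0^1 |R(t)| \, dt$ is a positive constant independent of $n$ representing an upper bound of the variance
of $Z$.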
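The comparison between the normalized largest eigenvalue and the Riemann sum of $|R|$ can be checked on a small example. The sketch below assumes, as an illustration that is not in the text, the 1-periodic covariance $R(t) = \cos(2\pi t)$ (the covariance of $A\cos(2\pi t) + B\sin(2\pi t)$ with independent standard normal $A$, $B$), and computes the largest eigenvalue by a plain power iteration.

```python
import math


def toeplitz_covariance(R, n):
    """Matrix with entries R(|l - l'| / n), l, l' = 1, ..., n."""
    return [[R(abs(i - k) / n) for k in range(n)] for i in range(n)]


def largest_eigenvalue(matrix, iterations=100):
    """Power iteration for a symmetric positive semi-definite matrix.

    For a unit vector v, ||M v|| never exceeds the largest eigenvalue and converges
    to it when the starting vector is not orthogonal to the top eigenspace.
    """
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant starting vector
    value = 0.0
    for _ in range(iterations):
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        w = [sum(row[k] * v[k] for k in range(n)) for row in matrix]
        value = math.sqrt(sum(x * x for x in w))
        if value == 0.0:
            return 0.0
        v = w
    return value


n = 41
R = lambda t: math.cos(2.0 * math.pi * t)  # hypothetical periodic covariance function
sigma_n = toeplitz_covariance(R, n)

normalized_eigenvalue = largest_eigenvalue(sigma_n) / n
riemann_sum = sum(abs(R(k / n)) for k in range(1, n + 1)) / n

print(f"gamma_max / n         = {normalized_eigenvalue:.4f}")
print(f"(1/n) sum |R(k/n)|    = {riemann_sum:.4f}")
print(f"integral of |R| (2/pi) = {2.0 / math.pi:.4f}")
```

For this covariance the normalized eigenvalue is about $1/2$, below the Riemann sum, which is itself close to $\int_0^1 |R(t)| \, dt = 2/\pi$.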



5.2 Choice of the smoothed estimators $\hat{f}_j$

A convenient choice for the smoothing of the observed curves in (3.3) is to do low-pass Fourier filtering.
Let $\hat{c}_{j,k} = \frac{1}{n} \sum_{\ell=1}^{n} Y_j^\ell e^{-i 2 \pi k \frac{\ell}{n}}$ for $k = -(n-1)/2, \ldots, (n-1)/2$ (assuming for simplicity that $n$ is odd),
and define for a spectral cut-off parameter $\lambda \in \mathbb{N}$ and $t \in [0, 1]$ the linear estimators

$$\hat{f}_j^\lambda(t) = \sum_{|k| \le \lambda} \hat{c}_{j,k} e^{i 2 \pi k t}. \tag{5.3}$$

Then, define the Sobolev ball $H_s(A)$ of radius $A > 0$ and regularity $s > 0$ as

$$H_s(A) = \left\{ f \in L^2_{per}([0, 1]), \ \sum_{k \in \mathbb{Z}} (1 + |k|^2)^s |c_k(f)|^2 \le A \right\}, \tag{5.4}$$

with $c_k(f) = \int_0^1 f(t) e^{-i 2 \pi k t} \, dt$, $k \in \mathbb{Z}$ for a function $f \in L^2_{per}([0, 1])$, and take $\mathcal{F} = H_s(A)$ as the smoothness
class to which the mean pattern $f$ is supposed to belong.
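The low-pass estimator can be sketched directly from its definition. The code below is a minimal illustration (function names are hypothetical); it computes the empirical Fourier coefficients by a direct sum, which is adequate for small $n$, and keeps only the frequencies $|k| \le \lambda$. The test signal, with a high-frequency component standing in for unwanted fluctuations, is an assumption made for the example.

```python
import cmath
import math


def fourier_coefficients(samples, cutoff):
    """Empirical coefficients c_k = (1/n) sum_{l=1}^{n} Y^l exp(-i 2 pi k l / n) for |k| <= cutoff.

    samples[l - 1] is the observation Y^l at the design point l / n.
    """
    n = len(samples)
    return {
        k: sum(y * cmath.exp(-2j * math.pi * k * (l + 1) / n) for l, y in enumerate(samples)) / n
        for k in range(-cutoff, cutoff + 1)
    }


def low_pass_estimator(samples, cutoff):
    """Return the function t -> sum_{|k| <= cutoff} c_k exp(i 2 pi k t) (real part)."""
    coeffs = fourier_coefficients(samples, cutoff)

    def f_hat(t):
        return sum(c * cmath.exp(2j * math.pi * k * t) for k, c in coeffs.items()).real

    return f_hat


# Hypothetical data: a low-frequency pattern plus a high-frequency perturbation, n odd.
n = 21
low = lambda t: math.cos(2.0 * math.pi * t) + 0.3 * math.sin(4.0 * math.pi * t)
samples = [low(l / n) + 0.5 * math.cos(16.0 * math.pi * l / n) for l in range(1, n + 1)]

f_hat = low_pass_estimator(samples, cutoff=2)
print(f_hat(0.3), low(0.3))
```

With cut-off 2 the component at frequency 8 is removed and the low-frequency pattern is recovered at any $t$, not only at the design points.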
5.3 Consistent estimation of the random shifts

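Using low-pass filtering, and following the discussion in Section 4.1 on identifiability issues, the estimators
of the random shifts $\theta_1^*, \ldots, \theta_J^*$ are given by

$$\hat\theta^\lambda = (\hat\theta_1^\lambda, \ldots, \hat\theta_J^\lambda) = \arg\min_{(\theta_1, \ldots, \theta_J) \in \Theta_0} M_\lambda(\theta_1, \ldots, \theta_J), \tag{5.5}$$

where the criterion $M_\lambda(\theta) = M_\lambda(\theta_1, \ldots, \theta_J)$ for $\theta \in \Theta^J$ is

$$M_\lambda(\theta) = \frac{1}{J} \sum_{j=1}^{J} \int_0^1 \left( \hat{f}_j^\lambda(t + \theta_j) - \frac{1}{J} \sum_{j'=1}^{J} \hat{f}_{j'}^\lambda(t + \theta_{j'}) \right)^2 dt,$$

and $\Theta_0$ is the constrained set defined in (4.2).

Theorem 5.1. Consider the model (3.3) and let $\hat\theta^\lambda$ be the estimator defined by (5.5). Assume that
$\mathcal{F} = H_s(A)$ for some $A > 0$ and $s \ge 1$, and that $Z$ is a centered stationary process with value in
$L^2_{per}([0, 1])$ and covariance function $R : [0, 1] \to \mathbb{R}$. Suppose that Assumptions 4.1 and 4.2 hold with
$\rho < 1/16$. Then, for any $\lambda \ge 1$ and $x > 0$

$$\mathbb{P}\left( \frac{1}{J} \| \hat\theta^\lambda - \theta^* \|_{\mathbb{R}^J}^2 \ge C_1(\Theta, \mathcal{F}, f) A_1(x, J, n, \lambda, \sigma^2, \gamma) + A_2(x, J) \right) \le 4 e^{-x},$$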

with Ai(x, J, n, A, a 2 , 7) = (a 2 + 7) J, n, A) +f(x, J, n, A)^ + (^J B(X, n) + B(X, n)J and ^(x, J) = 

VT + ij) > w ^ ere Ci(@, ?, f) > is constant depending only on 0, J 7 , /, J, n, A) = 2X ^ 1 (l + 4 j + 
, B{\, n ) = 2 ^ + \- 2s . andj = Ju\R(t)\dt. 

First, remark that for fixed values of $n$ and $\lambda$, $\lim_{J \to +\infty} A_2(x, J) = 0$. The term $A_1(x, J, n, \lambda, \sigma^2, \gamma)$
depends on the spectral cutoff $\lambda$ via the bias $B(\lambda, n)$ and the variance $v(x, J, n, \lambda)$ of the estimators $\hat{f}_j^\lambda$.
By choosing a sequence $\lambda = \lambda_n$ such that $\lim_{n \to +\infty} \lambda_n = +\infty$ and $\lim_{n \to +\infty} \frac{\lambda_n}{n} = 0$ (tradeoff between
low variance and low bias) it follows that for fixed $J$ and $x > 0$, $\lim_{n \to +\infty} A_1(x, J, n, \lambda_n, \sigma^2, \gamma) = 0$.
However, if $n$ remains fixed, then $\lim_{J \to +\infty} A_1(x, J, n, \lambda, \sigma^2, \gamma) > 0$.

Thus, Theorem 5.1 is consistent with the conclusions of Theorem 3.3, that is, if $n$ is fixed, then it is
not possible to estimate $\theta^*$ by letting only $J$ grow to infinity. Hence, under the assumptions of Theorem
5.1, one can only prove the convergence in probability of $\hat\theta^\lambda$ to the true shifts $\theta^*$ by taking the double
asymptotic $n \to +\infty$ and $J \to +\infty$, provided the smoothing parameter $\lambda = \lambda_n$ is well chosen.
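The two requirements on the cut-off sequence ($\lambda_n \to +\infty$ and $\lambda_n / n \to 0$) can be made concrete with one admissible choice. The rule $\lambda_n = \lfloor n^{1/(2s+1)} \rfloor$ used below is a standard heuristic assumed here for illustration, not a prescription from the text, and the two "proxy" functions only mimic the orders of magnitude (a variance growing like $\lambda / n$ and a squared bias decaying like $\lambda^{-2s}$), not the exact terms of Theorem 5.1.

```python
def cutoff_sequence(n, s):
    """Hypothetical rule lambda_n = floor(n^(1 / (2 s + 1))), clipped to [1, (n - 1) / 2]."""
    lam = int(n ** (1.0 / (2.0 * s + 1.0)))
    return max(1, min(lam, (n - 1) // 2))


def variance_proxy(n, lam):
    """Order of the variance of a low-pass estimator with 2 * lam + 1 coefficients."""
    return (2 * lam + 1) / n


def bias_proxy(lam, s):
    """Order of the squared bias for a pattern in a Sobolev ball of regularity s."""
    return lam ** (-2.0 * s)


s = 1.0
sizes = [11, 101, 1001, 10001]
cutoffs = [cutoff_sequence(n, s) for n in sizes]
for n, lam in zip(sizes, cutoffs):
    print(f"n = {n:6d}  lambda_n = {lam:3d}  lambda_n / n = {lam / n:.4f}  "
          f"variance ~ {variance_proxy(n, lam):.4f}  bias ~ {bias_proxy(lam, s):.4f}")
```

Both proxies decrease along the sequence, which is the tradeoff described above; a cut-off growing proportionally to $n$ would keep the variance proxy bounded away from zero.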

5.4 Consistent estimation of the mean pattern 

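In the case of randomly shifted curves, the Fréchet mean estimator (2.1) of $f$ is $\hat{f}^\lambda(t) = \frac{1}{J} \sum_{j=1}^{J} \hat{f}_j^\lambda(t + \hat\theta_j^\lambda)$.

Theorem 5.2. Under the assumptions of Theorem 5.1, for any $\lambda \ge 1$ and $x > 0$

$$\mathbb{P}\left( \| \hat{f}^\lambda - f \|_{L^2}^2 \ge C_2(\Theta, \mathcal{F}, f) A_1(x, J, n, \lambda, \sigma^2, \gamma) + C_3(\Theta, f) A_2(x, J) \right) \le 4 e^{-x},$$

where $A_1(x, J, n, \lambda, \sigma^2, \gamma)$ and $A_2(x, J)$ are defined in Theorem 5.1, $C_2(\Theta, \mathcal{F}, f)$ and $C_3(\Theta, f)$ are positive
constants depending only on $\Theta, \mathcal{F}, f$, and $\| \hat{f}^\lambda - f \|_{L^2}^2 = \int_0^1 | \hat{f}^\lambda(t) - f(t) |^2 \, dt$.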
Similar comments to those made on the consistency of the estimators of the shifts can be made. A
double asymptotic in $n$ and $J$ is needed to show that the Fréchet mean $\hat{f}^\lambda$ converges in probability to the
true mean pattern $f$. Moreover, if $\lambda_n$ is too large (e.g. such that $\lim_{n \to +\infty} \frac{\lambda_n}{n} \neq 0$, which corresponds to
undersmoothing), then Theorem 5.2 cannot be used to prove that $\hat{f}^\lambda$ converges to $f$ in probability. This
illustrates the fact that, to achieve consistency, a sufficient amount of pre-smoothing is necessary before
computing the Fréchet mean (2.1).

5.5 A lower bound for the Frechet mean 

From the results of Theorem 3.3, it is expected that the Fréchet mean $\hat{f}^\lambda$ does not converge to $f$ in the
setting $n$ fixed and $J \to +\infty$. To support this argument, consider the following ideal estimator

$$\tilde{f}(t) = \frac{1}{J} \sum_{j=1}^{J} f_j(t + \hat\theta_j^\lambda) = \frac{1}{J} \sum_{j=1}^{J} f(t - \theta_j^* + \hat\theta_j^\lambda), \quad \text{for all } t \in [0, 1], \tag{5.6}$$

where $f_j(t) = f(t - \theta_j^*)$, $j = 1, \ldots, J$. This corresponds to the case of an ideal smoothing step from the
data (3.3) that would yield $\hat{f}_j = f_j$ for all $j = 1, \ldots, J$. Obviously, $\tilde{f}(t)$ is not an estimator since it
depends on the unobserved quantities $f$ and $\theta^*$, but we can consider it as a benchmark to analyse the
convergence of the Fréchet mean $\hat{f}^\lambda$ to $f$.

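Theorem 5.3. Suppose that the assumptions of Theorem 3.3 are satisfied with $\rho < \frac{3}{4\pi}$. Then, for any $n > 1$, there exists $J_0 \in \mathbb{N}$ such that $J > J_0$ implies

$$\mathbb{E}\big[\|\tilde{f} - f\|_{L^2}\big] \geq C(f, \rho)\, \frac{n^{-1}\sigma^2}{\|\partial_t f\|_\infty^2 + n^{-1}\sigma^2 \int_\Theta \big(\partial_\theta \log g(\theta)\big)^2 g(\theta)\, d\theta}, \tag{5.7}$$

where the constant $C(f, \rho) > 0$ depends on $f$ and $\rho$.

Hence, in the setting $n$ fixed and $J \to +\infty$, even the ideal estimator $\tilde{f}$ does not converge to $f$ for the expected quadratic risk. This illustrates the central role played by the dimension $n$ of the data to obtain consistent estimators.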



6 Notations and main assumptions in the general case 

6.1 Smoothness of the mean pattern and the deformation operators 

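In this part, the notation $(\mathcal{L}_\theta)_{\theta \in \mathcal{P}}$ is used to denote either $(T_\theta)_{\theta \in \mathcal{P}}$ or their inverses $(\tilde{T}_\theta)_{\theta \in \mathcal{P}}$.

Assumption 6.1. For all $\theta \in \mathcal{P}$, $\mathcal{L}_\theta : L^2(\Omega) \to L^2(\Omega)$ is a linear operator satisfying $\mathcal{L}_\theta f \in \mathcal{F}$ for all $f \in \mathcal{F}$. There exists a constant $C(\Theta) > 0$ such that for any $f \in L^2(\Omega)$ and $\theta \in \Theta$,

$$\|\mathcal{L}_\theta f\|_{L^2}^2 \leq C(\Theta)\, \|f\|_{L^2}^2,$$

and a constant $C(\mathcal{F}, \Theta) > 0$ such that for any $f \in \mathcal{F}$ and $\theta_1, \theta_2 \in \Theta$,

$$\|\mathcal{L}_{\theta_1} f - \mathcal{L}_{\theta_2} f\|_{L^2}^2 \leq C(\mathcal{F}, \Theta)\, \|\theta_1 - \theta_2\|^2.$$

Assumption 6.1 can be interpreted as a Lipschitz condition on the mapping $(f, \theta) \mapsto \mathcal{L}_\theta f$. The first inequality, that is $\|\mathcal{L}_\theta f\|_{L^2}^2 \leq C(\Theta)\|f\|_{L^2}^2$, means that the action of the operator $\mathcal{L}_\theta$ does not change too much the norm of $f$ when $\theta$ varies in $\Theta$. Such an assumption on $T_\theta$ and its inverse $\tilde{T}_\theta$ forces the optimization problem (2.2) to have non-trivial solutions by preventing the functional $M(\boldsymbol{\theta})$ in (2.3) from being arbitrarily small. It can be easily checked that Assumption 6.1 is satisfied in the case (1.2) of shifted curves with $\mathcal{F} = H_s(A)$ and $s > 1$.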
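For shifted curves, both inequalities of Assumption 6.1 can be checked numerically. The following Python sketch is only an illustration: it uses the mean pattern of the numerical section, two hypothetical shift values, and the Lipschitz constant $\|\partial_t f\|_{L^2}^2$, which is a standard choice for periodic functions and an assumption of this sketch rather than a constant stated in the paper.

```python
import math

def f(t):
    # Mean pattern of the numerical section (a trigonometric polynomial).
    return 9.0 * math.sin(2 * math.pi * t) + 2.0 * math.cos(8 * math.pi * t)

def df(t):
    # Derivative of f with respect to t.
    return (18.0 * math.pi * math.cos(2 * math.pi * t)
            - 16.0 * math.pi * math.sin(8 * math.pi * t))

def sq_norm(g, m=256):
    # Squared L2([0,1]) norm by a Riemann sum (exact here up to rounding,
    # since the integrands are trigonometric polynomials of degree < m).
    return sum(g(i / m) ** 2 for i in range(m)) / m

theta1, theta2 = 0.03, -0.05  # hypothetical shifts

norm_f = sq_norm(f)
norm_shifted = sq_norm(lambda t: f(t - theta1))          # ||T_theta f||^2
lhs = sq_norm(lambda t: f(t - theta1) - f(t - theta2))   # ||T_theta1 f - T_theta2 f||^2
rhs = sq_norm(df) * (theta1 - theta2) ** 2               # ||f'||^2 |theta1 - theta2|^2
```

The shift operator is an isometry (`norm_shifted` equals `norm_f`), and `lhs <= rhs` is the Lipschitz bound with the constant assumed above.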
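6.2 The preliminary smoothing step

For $j = 1, \ldots, J$ the $\hat{f}_j$'s are supposed to belong to the class of linear estimators in the sense of the following definition:

Definition 6.1. Let $\Lambda$ denote either $\mathbb{N}$ or $\mathbb{R}_+$ (set of smoothing parameters). To every $\lambda \in \Lambda$ is associated a non-random vector valued function $S_\lambda : \Omega \to \mathbb{R}^n$ such that for all $j = 1, \ldots, J$ and all $t \in \Omega$

$$\hat{f}_j(t) = \hat{f}_j^\lambda(t) = \langle S_\lambda(t), \mathbf{Y}_j \rangle,$$

where $\langle \cdot, \cdot \rangle$ denotes the standard inner product in $\mathbb{R}^n$ and $\mathbf{Y}_j = (Y_j^\ell)_{\ell=1}^n \in \mathbb{R}^n$.

Assumption 6.2. For all $\lambda \in \Lambda$ and all $\ell = 1, \ldots, n$, the function $t \mapsto S_\lambda^\ell(t)$ belongs to $L^2(\Omega)$, where $S_\lambda^\ell(t)$ denotes the $\ell$-th component of the vector $S_\lambda(t)$. Moreover, for all $\lambda \in \Lambda$, $f \in \mathcal{F}$ and $\theta \in \Theta$, the function $t \mapsto \langle S_\lambda(t), \mathbf{T}_\theta \mathbf{f} \rangle$ belongs to $\mathcal{F}$, where $\mathbf{T}_\theta \mathbf{f} = (T_\theta f(t_\ell))_{\ell=1}^n \in \mathbb{R}^n$.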

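In the case (1.2) of randomly shifted curves with an equi-spaced design, Assumption 6.2 holds with $S_\lambda^\ell(t) = \frac{1}{n}\sum_{|k| \leq \lambda} e^{i 2\pi k (t - t_\ell)}$. Let us now specify how the bias/variance behavior of the linear estimators $\hat{f}^\lambda$ depends on the smoothing parameter $\lambda$. For this, consider for some function $f \in \mathcal{F}$ the following regression model

$$Y_\ell = f(t_\ell) + \sigma \varepsilon_\ell, \quad \ell = 1, \ldots, n,$$

where the $\varepsilon_\ell$'s are i.i.d. normal variables with zero mean and variance 1. The performance of a linear estimator $\hat{f}^\lambda(t) = \langle S_\lambda(t), \mathbf{Y} \rangle$, where $\mathbf{Y} = (Y_\ell)_{\ell=1}^n$, can be evaluated in terms of the expected quadratic risk $R_\lambda(\hat{f}^\lambda, f)$ defined by

$$R_\lambda(\hat{f}^\lambda, f) := \mathbb{E}\|\hat{f}^\lambda - f\|_{L^2}^2 = \int_\Omega |B_\lambda(f, t)|^2\, dt + \sigma^2 \int_\Omega V_\lambda(t)\, dt,$$

where $B_\lambda$ and $V_\lambda$ denote the usual bias and variance of $\hat{f}^\lambda$ given by $B_\lambda(f, t) = \langle S_\lambda(t), \mathbf{f} \rangle - f(t)$ and $V_\lambda(t) = \|S_\lambda(t)\|_{\mathbb{R}^n}^2$, for $t \in \Omega$, where $\mathbf{f} = (f(t_\ell))_{\ell=1}^n$. Define also $V(\lambda) = \int_\Omega V_\lambda(t)\, dt$, and let us make the following assumption on the asymptotic behavior of the bias/variance of $\hat{f}^\lambda$: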

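Assumption 6.3. There exist a constant $\kappa(\mathcal{F}) > 0$ and a real-valued function $\lambda \mapsto B(\lambda)$, such that for all $f \in \mathcal{F}$,

$$\|B_\lambda(f, \cdot)\|_{L^2}^2 = \|\langle S_\lambda(\cdot), \mathbf{f} \rangle - f(\cdot)\|_{L^2}^2 \leq \kappa(\mathcal{F})\, B(\lambda).$$

Moreover there exists a sequence of smoothing parameters $(\lambda_n)_{n \in \mathbb{N}} \in \Lambda^{\mathbb{N}}$ with $\lim_{n \to +\infty} \lambda_n = +\infty$ such that $\lim_{n \to +\infty} B(\lambda_n) = 0$ and $\lim_{n \to +\infty} V(\lambda_n) = 0$.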

Let us illustrate Assumption 6.3 in the case of shifted curves with an equi-spaced design, and a smoothing step obtained by low-pass Fourier filtering. As in Section 5, take $\mathcal{F} = H_s(A)$ as defined there. In this setting, $V(\lambda) = \frac{2\lambda + 1}{n}$. It can be also checked that $\|B_\lambda(f, \cdot)\|_{L^2}^2 \leq C(A)\, B(\lambda)$ for some

positive constant C(A) depending only on A, and B(X) = 2X ^ 1 + X~ 2s . Thus, Assumption 16.31 holds with 

$\lambda_n = n^{\frac{1}{2s+1}}$.
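The value of the integrated variance $V(\lambda)$ for low-pass Fourier filtering can be checked numerically. The sketch below assumes the weights $S_\lambda^\ell(t) = \frac{1}{n}\sum_{|k| \leq \lambda} e^{i2\pi k(t - t_\ell)}$ with the equi-spaced design $t_\ell = \ell/n$ and $2\lambda < n$. The function names and the values of $n$ and $\lambda$ are hypothetical, chosen only for the illustration.

```python
import cmath

def lowpass_weights(t, n, lam):
    # S_lambda^l(t) = (1/n) * sum_{|k| <= lam} exp(i 2 pi k (t - t_l)), with t_l = l/n.
    return [
        sum(cmath.exp(2j * cmath.pi * k * (t - l / n)) for k in range(-lam, lam + 1)) / n
        for l in range(n)
    ]

def integrated_variance(n, lam, m=64):
    # V(lambda) = integral over [0,1] of ||S_lambda(t)||^2, approximated by a Riemann sum.
    total = 0.0
    for i in range(m):
        w = lowpass_weights(i / m, n, lam)
        total += sum(abs(x) ** 2 for x in w)
    return total / m

n, lam = 32, 3
V = integrated_variance(n, lam)
expected = (2 * lam + 1) / n
```

Under these assumptions $\|S_\lambda(t)\|^2$ does not depend on $t$, so `V` matches `(2 * lam + 1) / n` up to rounding. The variance grows linearly with the cut-off $\lambda$ and decreases like $1/n$.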

6.3 Random perturbation of the mean pattern $f$ by the $Z_j$'s

Assumption 6.4. For any $n > 1$, there exists a real $\gamma_n(\Theta) > 0$ such that for any $\theta \in \Theta$

$$\gamma_{\max}\big(\mathbb{E}\big[\mathbf{T}_\theta \mathbf{Z}\, (\mathbf{T}_\theta \mathbf{Z})'\big]\big) \leq \gamma_n(\Theta),$$

where $\mathbf{T}_\theta \mathbf{Z} = (T_\theta Z(t_\ell))_{\ell=1}^n \in \mathbb{R}^n$, and $\gamma_{\max}(A)$ denotes the largest eigenvalue of a symmetric matrix $A$. Moreover,

$$\lim_{n \to \infty} \gamma_n(\Theta)\, \sqrt{V(\lambda_n)} = 0, \tag{6.1}$$

where $V(\lambda_n)$ is the variance defined in Assumption 6.3.

Intuitively, the condition (6.1) means that the variance of the linear smoother $S_\lambda(\cdot)$ has to be asymptotically smaller than the maximal correlations (measured by $\gamma_n(\Theta)$) between $T_\theta Z(t_\ell)$ and $T_\theta Z(t_{\ell'})$ for $\ell, \ell' = 1, \ldots, n$ and all $\theta \in \Theta$. In the case of randomly shifted curves with an equi-spaced design, a simple condition under which Assumption 6.4 holds is that $Z$ is a stationary process (see the arguments in Section 5.1).



7 Consistency in the general case 

7.1 Consistent estimation of the deformation parameters 

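Consider for $\lambda \in \Lambda$ the following estimator of the deformation parameters

$$\hat{\boldsymbol{\theta}}^\lambda = \operatorname*{argmin}_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} M_\lambda(\boldsymbol{\theta}),$$

where

$$M_\lambda(\boldsymbol{\theta}) = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{Y}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{Y}_{j'} \rangle \Big)^2 dt \tag{7.1}$$

and $\boldsymbol{\Theta}$ is the constrained set introduced in Assumption 4.3. The estimator $\hat{\boldsymbol{\theta}}^\lambda$ thus depends on the choice of $\boldsymbol{\Theta}$, and it will be shown that $\hat{\boldsymbol{\theta}}^\lambda$ is a consistent estimator of the vector $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}} \in \mathbb{R}^{pJ}$ defined in Assumption 4.3. Note that depending on the problem at hand and the choice of the constrained set $\boldsymbol{\Theta}$, it can be shown that $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$ is close to the true deformation parameters $\boldsymbol{\theta}^*$. For example, in the case of shifted curves, if $\boldsymbol{\Theta} = \boldsymbol{\Theta}_0$ defined in (4.2), and if the density $g$ of the shifts has zero mean, then $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} = (\theta_1^* - \bar{\theta}^*, \ldots, \theta_J^* - \bar{\theta}^*)$ with $\bar{\theta}^* = \frac{1}{J}\sum_{j=1}^{J} \theta_j^*$ can be shown to be close to $\boldsymbol{\theta}^*$ (see Lemma C.1 in the Appendix). This makes it possible to show the consistency of $\hat{\boldsymbol{\theta}}^\lambda$ to $\boldsymbol{\theta}^*$ as formulated in Theorem 5.1. Therefore, the next result only bounds the distance between $\hat{\boldsymbol{\theta}}^\lambda$ and $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$.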

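Theorem 7.1. Consider the model (1.5) and suppose that Assumptions 4.3, 4.4 and 6.1 to 6.4 hold with $n > 1$ and $J > 2$. Then, for any $\lambda \in \Lambda$ and $x > 0$

$$\mathbb{P}\Big( \frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}}\|^2 \geq C_1(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) \Big[ \big(\gamma_n(\Theta) + \sigma^2\big)\Big(\sqrt{v(x, J, \lambda)} + v(x, J, \lambda)\Big) + \Big(\sqrt{B(\lambda)} + B(\lambda)\Big) \Big] \Big) \leq 2e^{-x}, \tag{7.2}$$

with $C_1(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) > 0$ and $v(x, J, \lambda) := V(\lambda)\Big(1 + 4\frac{x}{J} + \sqrt{4\frac{x}{J}}\Big)$.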

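Using Assumptions 6.3 and 6.4 it follows that $\lim_{n \to +\infty} \gamma_n(\Theta)\Big(\sqrt{v(x, J, \lambda_n)} + v(x, J, \lambda_n)\Big) = 0$ for any $x > 0$ and $J > 2$. If $J$ remains fixed, Theorem 7.1 thus implies that $\hat{\boldsymbol{\theta}}^{\lambda_n}$ converges in probability to $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$ as $n \to +\infty$. To the contrary, let us fix $n$, and consider an asymptotic setting where only $J \to +\infty$. For any $x > 0$ and $\lambda \in \Lambda$, $\lim_{J \to +\infty} v(x, J, \lambda) = V(\lambda)$. Therefore, Theorem 7.1 cannot be used to prove that $\hat{\boldsymbol{\theta}}^\lambda$ converges to $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$ as $J \to +\infty$. This confirms that $\hat{\boldsymbol{\theta}}^\lambda$ is not a consistent estimator of $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$ (and thus of $\boldsymbol{\theta}^*$) as $n$ remains fixed and $J$ tends to infinity.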
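The two asymptotic regimes discussed above can be made concrete with a short Python sketch of the quantity $v(x, J, \lambda)$. It assumes the low-pass Fourier value $V(\lambda) = (2\lambda + 1)/n$. The values of $x$, $\lambda$, $n$ and $J$ are hypothetical and serve only as an illustration.

```python
import math

def V(lam, n):
    # Integrated variance of the low-pass Fourier smoother (assumed form).
    return (2 * lam + 1) / n

def v(x, J, lam, n):
    # v(x, J, lambda) = V(lambda) * (1 + 4 x / J + sqrt(4 x / J)).
    r = x / J
    return V(lam, n) * (1 + 4 * r + math.sqrt(4 * r))

x, lam = 1.0, 7

# n fixed, J growing: v decreases towards V(lambda), which stays positive.
fixed_n = [v(x, J, lam, 512) for J in (10, 100, 10_000, 1_000_000)]

# J fixed, n growing: v tends to zero.
fixed_J = [v(x, 20, lam, n) for n in (512, 4096, 32768, 262144)]
```

With $n$ fixed the bound stalls at $V(\lambda) > 0$ however large $J$ is, whereas it vanishes as $n$ grows.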
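7.2 Consistent estimation of the mean pattern

Recall that the estimator $\hat{f}^\lambda$ of the mean pattern $f$ is defined as $\hat{f}^\lambda = \frac{1}{J}\sum_{j=1}^{J} \tilde{T}_{\hat{\theta}_j^\lambda} \hat{f}_j^\lambda$. We study the consistency of $\hat{f}^\lambda$ with respect to the shape function

$$f_{\boldsymbol{\Theta}} := \frac{1}{J}\sum_{j=1}^{J} \tilde{T}_{[\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j} T_{\theta_j^*} f$$

defined for $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}} = ([\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_1, \ldots, [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_J)$. Again, depending on the problem at hand and the choice of the constrained set $\boldsymbol{\Theta}$, it can be shown that $f_{\boldsymbol{\Theta}}$ is close to the true mean pattern $f$. For example, in the case of shifted curves with $\boldsymbol{\Theta} = \boldsymbol{\Theta}_0$ defined in (4.2), then $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} = (\theta_1^* - \bar{\theta}^*, \ldots, \theta_J^* - \bar{\theta}^*)$ with $\bar{\theta}^* = \frac{1}{J}\sum_{j=1}^{J} \theta_j^*$. In this case $f_{\boldsymbol{\Theta}_0}(t) := \frac{1}{J}\sum_{j=1}^{J} f(t - \theta_j^* + [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}]_j) = f(t - \bar{\theta}^*)$. Hence, under the condition that $\int_\Theta \theta g(\theta)\, d\theta = 0$, then $\bar{\theta}^* \approx 0$ for $J$ sufficiently large, and thus $f_{\boldsymbol{\Theta}_0}(t)$ is close to $f$, which makes it possible to show the consistency of $\hat{f}^\lambda$ to $f$ as formulated in Theorem 5.2.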

Theorem 7.2. Consider the model (ll.5p anc? suppose that Assumptions !! . i\ \4-3\ and [b\l\ to \6.4\ hold. 
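Then, for any $\lambda \in \Lambda$ and $x > 0$

$$\mathbb{P}\Big( \|\hat{f}^\lambda - f_{\boldsymbol{\Theta}}\|_{L^2}^2 \geq C_2(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) \Big[ \big(\gamma_n(\Theta) + \sigma^2\big)\Big(\sqrt{v(x, J, \lambda)} + v(x, J, \lambda)\Big) + \Big(\sqrt{B(\lambda)} + B(\lambda)\Big) \Big] \Big) \leq 2e^{-x}, \tag{7.3}$$

where $C_2(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) > 0$ is a constant depending only on $\Theta$, $\boldsymbol{\Theta}$, $\mathcal{F}$, and $f$.

The consistency of $\hat{f}^\lambda$ to $f_{\boldsymbol{\Theta}}$ is thus guaranteed when $n$ goes to infinity provided the level of smoothing $\lambda = \lambda_n$ is chosen so that $\lim_{n \to +\infty} V(\lambda_n) = \lim_{n \to +\infty} B(\lambda_n) = 0$. Again, if $n$ remains fixed and only $J$ goes to infinity, then Theorem 7.2 cannot be used to prove the convergence of $\hat{f}^\lambda$ to $f_{\boldsymbol{\Theta}}$.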

8 Numerical experiments for randomly shifted curves 



Consider the model (13. 3p with random shifts 6j having a uniform density g with compact support equal 
to [- |, |], and $f(t) = 9\sin(2\pi t) + 2\cos(8\pi t)$ for $t \in [0, 1]$ as a mean pattern, see Figure 2(a). For the
constrained set we took 

O = {0G ["i|] J , 0i + ---+0j = o}. 

We use Fourier low-pass filtering with the spectral cut-off set to $\lambda = 7$, which is a reasonable value to reconstruct $f$ and represents a good trade-off between bias and variance. We present some results of simulations under various assumptions on the process $Z$ and the level $\sigma$ of additive noise in the measurements.

Shape invariant model (SIM). The first numerical applications illustrate the role of $n$ and $J$ in the SIM model. Figure 2(b) gives a sample of the data used with $\sigma = 2$. The factors in the simulations are the number $J$ of curves and the number of design points $n$. For each combination of these two factors, we simulate $M = 20$ repetitions of model (3.3). For each repetition we computed $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*\|^2$ and $\|\hat{f}^\lambda - f\|_{L^2}^2$. Boxplots of these quantities are displayed in Figure 3(a) and 3(b) respectively, for $J = 20, 40, \ldots, 100$ and $n = 512$ (in gray) and $n = 1024$ (in black). As the smoothing parameter is fixed to $\lambda = 7$, increasing $n$ simply reduces the variance of the linear smoothers $\hat{f}_j^\lambda$. Recall that the lower bound given in Section 3 shows that $\frac{1}{J}\mathbb{E}\big[\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*\|^2\big]$ does not decrease as $J$ increases but should be smaller when the number of points $n$ increases. This is exactly what we observe in Figure 3(a). Similarly, the quantity $\|\hat{f}^\lambda - f\|_{L^2}^2$ is clearly smaller with $n = 1024$ than with $n = 512$.

Figure 2: (a) mean pattern $f$. (b) $J = 3$ noisy curves in the SIM with $\sigma = 2$. (c) $J = 3$ noisy curves with $\sigma = 0$ and a stationary process $Z$ with $\varsigma = 4$.

Figure 3: Boxplots of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ (a) and $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$ (b), plotted against $J$, over $M = 20$ repetitions from a SIM model of shifted curves. Boxplots in gray correspond to $n = 512$, and in black to $n = 1024$.

Complete model. We now add the terms $Z_j$ in (3.3) to model linear variations in amplitude of the curves around the template $f$. First, we generate a stationary periodic Gaussian process. To do this, the covariance matrix must be a particular Toeplitz matrix. As suggested in [Gre93], one possibility is to choose

$$R(t) = \varsigma\, \frac{e^{\phi(t - 1/2)} + e^{-\phi(t - 1/2)}}{e^{\phi/2} + e^{-\phi/2}},$$

where $\phi$ is a strictly positive parameter (we took $\phi = 4$) and $\varsigma$ a variance parameter. The level of additive noise is $\sigma = 8$, and we took $\varsigma = 4$. As an illustration, in Figure 2(c) we plot $f + Z_j$, $j = 1, 2, 3$ with $\varsigma = \phi = 4$. Over $M = 20$ repetitions, we have computed the values of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ and $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$ for $J$ varying from 20 to 100 and $n = 512, 1024$. The results are displayed in Figure 4(a) and 4(b). We observe the same behaviors as in the simulations with the SIM model: the variance of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ does not decrease as $J$ increases (see Figure 4(a)) and $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$ has a smaller mean and variance as $n$ increases.
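A stationary periodic Gaussian process with a covariance of the hyperbolic-cosine type described above can be simulated on an equi-spaced grid through the eigen-decomposition of its circulant covariance matrix. The Python sketch below is one possible implementation and not necessarily the one used for the experiments. The function names, the naive discrete Fourier transform, the grid size and the seed are assumptions made for the illustration, and the prefactor of $R$ is taken to be the variance parameter itself.

```python
import cmath
import math
import random

def covariance(t, phi=4.0, varsigma=4.0):
    # R(t) on [0, 1]: symmetric around t = 1/2, hence even and 1-periodic.
    num = math.exp(phi * (t - 0.5)) + math.exp(-phi * (t - 0.5))
    den = math.exp(phi / 2) + math.exp(-phi / 2)
    return varsigma * num / den

def dft(x, sign=-1):
    # Naive O(n^2) discrete Fourier transform (sign=+1 gives the unnormalised inverse).
    n = len(x)
    return [
        sum(x[l] * cmath.exp(sign * 2j * cmath.pi * k * l / n) for l in range(n))
        for k in range(n)
    ]

def sample_periodic_process(n, rng, phi=4.0, varsigma=4.0):
    # First row of the circulant (periodic Toeplitz) covariance matrix.
    c = [covariance(l / n, phi, varsigma) for l in range(n)]
    # Its eigenvalues are the DFT of the first row (real because c is symmetric).
    eig = [z.real for z in dft(c)]
    # Colour a white Gaussian vector in the Fourier domain.
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    spec = [math.sqrt(e) * z for e, z in zip(eig, dft(w))]
    sample = [v.real / n for v in dft(spec, sign=+1)]
    return c, eig, sample

n = 64
c, eig, z = sample_periodic_process(n, random.Random(0))
```

All eigenvalues are positive for this covariance, so the square root is well defined. The vector `z` is one draw of $(Z(t_1), \ldots, Z(t_n))$ whose covariance is the circulant matrix with first row `c`.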



We finally run the same simulations with a non-stationary noise $Z_j(t) = a_j \psi(t)$, where $\psi$ is a positive periodic smooth deterministic function such that $\|\psi\|_{L^2} = 1$ and $a_j \sim \mathcal{N}(0, \varsigma^2)$ with $\varsigma = 4$. Note that, in this case, the sequence $\gamma_n(\Theta)$ is of order $n$ and Assumption 6.4 is not verified. The levels of noise ($\sigma$ and $\varsigma$) are the same as in the stationary case in order to make things comparable. The results are presented in the same manner in Figure 5(a) for $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ and in Figure 5(b) for $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$. One can see that the results are very different. The estimators of the shifts have a much larger mean and variance, and the variance of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ remains rather high even when $n$ or $J$ increases (see Figure 5(a)). The convergence to zero of $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$, which was clear in the stationary case, is now not so obvious in view of the numerical results displayed in Figure 5(b).

Figure 4: Boxplots of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ (a) and $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$ (b), plotted against $J$, in model (3.3) with a stationary error term $Z$. Boxplots in gray correspond to $n = 512$, and in black to $n = 1024$.

Figure 5: Boxplots of $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2$ (a) and $\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2$ (b) in model (3.3) with a non-stationary error term $Z$. Boxplots in gray correspond to $n = 512$, and in black to $n = 1024$.
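The remark that $\gamma_n(\Theta)$ is of order $n$ for the non-stationary noise $Z_j(t) = a_j\psi(t)$ can be checked directly: the covariance matrix of $(Z(t_1), \ldots, Z(t_n))$ is the rank-one matrix $\varsigma^2 \boldsymbol{\psi}\boldsymbol{\psi}'$, whose largest eigenvalue is $\varsigma^2 \sum_\ell \psi(t_\ell)^2$. The Python sketch below uses a hypothetical choice of $\psi$, which is an assumption of this illustration and is not specified in the text above.

```python
import math

def psi(t):
    # Hypothetical positive, smooth, 1-periodic function with unit L2([0,1]) norm:
    # the integral of (1 + 0.5 cos(2 pi t))^2 over [0, 1] equals 1.125.
    return (1.0 + 0.5 * math.cos(2 * math.pi * t)) / math.sqrt(1.125)

def gamma_max_rank_one(n, varsigma=4.0):
    # Largest eigenvalue of varsigma^2 * psi psi' on the design t_l = l / n.
    return varsigma ** 2 * sum(psi(l / n) ** 2 for l in range(n))

ratios = {n: gamma_max_rank_one(n) / n for n in (512, 1024)}
```

The ratio $\gamma_{\max}/n$ stays equal to $\varsigma^2 = 16$ for both design sizes, so the largest eigenvalue grows linearly in $n$. With $V(\lambda_n)$ of order $\lambda_n/n$, the product $\gamma_n(\Theta)\sqrt{V(\lambda_n)}$ then cannot tend to zero.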
9 Conclusion and perspectives

We have proposed to use a Fréchet mean of smoothed data to estimate a mean pattern of curves or
images satisfying a non-parametric regression model including random deformations. Upper and lower 
bounds (in probability and expectation) for the estimation of the deformation parameters and the mean 
pattern have been derived. Our main result is that these bounds go to zero as the dimension n of the 
data (the number of sample points) goes to infinity, but that an asymptotic setting only in J (the number 
of observed curves or images) is not sufficient to obtain consistent estimators. An interesting topic for 
future investigation would be to study the rate of convergence of such estimators and to analyze their 
optimality (e.g. from a minimax point of view). 
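A Proof of the results in Section 3

A.1 Proof of Theorem 3.1

Write $\theta_j^* = ([\theta_j^*]_1, \ldots, [\theta_j^*]_p)$, and let $\mathbf{Y} = (\mathbf{Y}_1, \ldots, \mathbf{Y}_J) \in \mathbb{R}^{nJ}$ be the column vector of the observations generated by model (1.7). Conditionally to $\boldsymbol{\theta}^*$, $\mathbf{Y}$ is a Gaussian vector and its log-likelihood is equal to

$$\log\big(p(\mathbf{Y}|\boldsymbol{\theta}^*)\big) = -\frac{Jn}{2}\log(2\pi) + \frac{J}{2}\log(\det(\Lambda)) - \frac{1}{2}\sum_{j=1}^{J} (\mathbf{Y}_j - \mathbf{T}_{\theta_j^*}\mathbf{f})' \Lambda (\mathbf{Y}_j - \mathbf{T}_{\theta_j^*}\mathbf{f}), \tag{A.1}$$

where $\Lambda = \sigma^{-2} \mathrm{Id}_n$. Therefore, we have the expected score $\mathbb{E}_{\boldsymbol{\theta}^*}\big[\partial_{[\theta_{j_1}^*]_{p_1}} \log(p(\mathbf{Y}|\boldsymbol{\theta}^*))\big] = 0$ for all $j_1 = 1, \ldots, J$ and $p_1 = 1, \ldots, p$ and

$$\mathbb{E}_{\boldsymbol{\theta}^*}\big[\partial_{[\theta_{j_1}^*]_{p_1}} \log(p(\mathbf{Y}|\boldsymbol{\theta}^*))\, \partial_{[\theta_{j_2}^*]_{p_2}} \log(p(\mathbf{Y}|\boldsymbol{\theta}^*))\big] = \begin{cases} \big(\partial_{[\theta_{j_1}^*]_{p_1}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f}\big)' \Lambda \big(\partial_{[\theta_{j_1}^*]_{p_2}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f}\big) & \text{if } j_1 = j_2, \\ 0 & \text{otherwise}, \end{cases} \tag{A.2}$$

where $\partial_{[\theta_{j_1}^*]_{p_1}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f} = \big[\partial_{[\theta_{j_1}^*]_{p_1}} T_{\theta_{j_1}^*} f(t_\ell)\big]_{\ell=1}^n$. Then, for each $j_1 = 1, \ldots, J$ and $p_1 = 1, \ldots, p$ we have

$$\big(\partial_{[\theta_{j_1}^*]_{p_1}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f}\big)' \Lambda \big(\partial_{[\theta_{j_1}^*]_{p_1}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f}\big) \leq \sigma^{-2} \big\|\partial_{[\theta_{j_1}^*]_{p_1}} \mathbf{T}_{\theta_{j_1}^*}\mathbf{f}\big\|^2 \leq C(\Theta, f)\, n\, \sigma^{-2}, \tag{A.3}$$

where the last inequality is a consequence of Assumption 3.1. From now on, $\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}(\mathbf{Y}) = (\hat{\theta}_1(\mathbf{Y}), \ldots, \hat{\theta}_J(\mathbf{Y}))$ is an arbitrary estimator (i.e. any measurable function of $\mathbf{Y}$) of the true parameter $\boldsymbol{\theta}^*$. Let also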



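$$U = \hat{\boldsymbol{\theta}} - \boldsymbol{\theta}^* \quad \text{and} \quad V = \Big( \big[\partial_{[\theta_1^*]_{p_1}} \log\big(p(\mathbf{Y}|\boldsymbol{\theta}^*) g^J(\boldsymbol{\theta}^*)\big)\big]_{p_1=1}^{p}, \ldots, \big[\partial_{[\theta_J^*]_{p_1}} \log\big(p(\mathbf{Y}|\boldsymbol{\theta}^*) g^J(\boldsymbol{\theta}^*)\big)\big]_{p_1=1}^{p} \Big)$$

be column vectors of $\mathbb{R}^{pJ}$, where $g^J(\boldsymbol{\theta}) = g(\theta_1) \cdots g(\theta_J)$. Then, the Cauchy-Schwarz inequality implies

$$\big(\mathbb{E}[U'V]\big)^2 \leq \mathbb{E}[U'U]\, \mathbb{E}[V'V]. \tag{A.4}$$

In the sequel we write $g^J(\boldsymbol{\theta})\, d\boldsymbol{\theta} = g(\theta_1) \cdots g(\theta_J)\, d\theta_1 \cdots d\theta_J$. We have

$$\mathbb{E}[U'V] = \sum_{j=1}^{J}\sum_{p_1=1}^{p} \int_{\mathbb{R}^{nJ}} \int_{\Theta^J} \big([\hat{\theta}_j(y)]_{p_1} - [\theta_j]_{p_1}\big)\, \partial_{[\theta_j]_{p_1}} \big(p(y|\boldsymbol{\theta}) g^J(\boldsymbol{\theta})\big)\, d\boldsymbol{\theta}\, dy = \sum_{j=1}^{J}\sum_{p_1=1}^{p} \int_{\mathbb{R}^{nJ}} \int_{\Theta^J} p(y|\boldsymbol{\theta}) g^J(\boldsymbol{\theta})\, d\boldsymbol{\theta}\, dy.$$

Indeed, Assumption 1.1 and the differentiability of $g$ imply that, for all $p_1 = 1, \ldots, p$, $g(\theta)$ tends to 0 as the coordinate $[\theta]_{p_1}$ approaches the boundary of $\Theta$. Then, an integration by parts and Fubini's theorem give $\int_{\Theta^J} \partial_{[\theta_j]_{p_1}} \big(p(y|\boldsymbol{\theta}) g^J(\boldsymbol{\theta})\big)\, d\boldsymbol{\theta} = 0$. Again, with the same arguments, $\int_{\Theta^J} [\theta_j]_{p_1}\, \partial_{[\theta_j]_{p_1}} \big(p(y|\boldsymbol{\theta}) g^J(\boldsymbol{\theta})\big)\, d\boldsymbol{\theta} = -\int_{\Theta^J} p(y|\boldsymbol{\theta}) g^J(\boldsymbol{\theta})\, d\boldsymbol{\theta}$ and thus

$$\mathbb{E}[U'V] = pJ.$$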
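Now, using that the expected score is zero and equation (A.2) we have

$$\mathbb{E}[V'V] = \sum_{j=1}^{J}\sum_{p_1=1}^{p} \Big( \mathbb{E}\big[\big(\partial_{[\theta_j^*]_{p_1}} \log(p(\mathbf{Y}|\boldsymbol{\theta}^*))\big)^2\big] + \mathbb{E}\big[\big(\partial_{[\theta_j^*]_{p_1}} \log(g(\theta_j^*))\big)^2\big] \Big)$$

$$= \sum_{j=1}^{J}\sum_{p_1=1}^{p} \int_{\Theta^J} \big(\partial_{[\theta_j]_{p_1}} \mathbf{T}_{\theta_j}\mathbf{f}\big)' \Lambda \big(\partial_{[\theta_j]_{p_1}} \mathbf{T}_{\theta_j}\mathbf{f}\big)\, g^J(\boldsymbol{\theta})\, d\boldsymbol{\theta} + J \int_\Theta \|\partial_{\theta_1} \log(g(\theta_1))\|^2 g(\theta_1)\, d\theta_1,$$

where $\partial_{\theta_1} \log(g(\theta_1)) = \big[\partial_{[\theta_1]_1} \log(g(\theta_1)), \ldots, \partial_{[\theta_1]_p} \log(g(\theta_1))\big]' \in \mathbb{R}^p$. Then, using inequality (A.3), it gives $\mathbb{E}[V'V] \leq pJn\, C(\Theta, f)\, \sigma^{-2} + J \int_\Theta \|\partial_{\theta_1} \log(g(\theta_1))\|^2 g(\theta_1)\, d\theta_1$. Hence, using equation (A.4), for any estimator $\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}(\mathbf{Y})$ (see Theorem 1 in [GL95])

$$\mathbb{E}\|\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}^*\|^2 \geq \frac{pJ}{n\, C(\Theta, f)\, \sigma^{-2} + p^{-1} \int_\Theta \|\partial_{\theta_1} \log(g(\theta_1))\|^2 g(\theta_1)\, d\theta_1} = \frac{\sigma^2 n^{-1} pJ}{C(\Theta, f) + n^{-1} p^{-1} \sigma^2 \int_\Theta \|\partial_{\theta_1} \log(g(\theta_1))\|^2 g(\theta_1)\, d\theta_1}.$$

And since $p \geq 1$, the claim in Theorem 3.1 is proved. □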



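A.2 Proof of Theorem 3.2

As above, let $\mathbf{Y} \in \mathbb{R}^{nJ}$ be the column vector generated by model (1.5). Then, conditionally to $\boldsymbol{\theta}^*$, $\mathbf{Y}$ is a Gaussian vector and Assumption 3.2 ensures that its log-likelihood has the same expression as in equation (A.1) but with

$$\Lambda = \Lambda(\Theta) = \Big(\sigma^2 \mathrm{Id}_n + \mathbb{E}_{\boldsymbol{\theta}^*}\big[\mathbf{T}_{\theta_j^*}\mathbf{Z}_j (\mathbf{T}_{\theta_j^*}\mathbf{Z}_j)'\big]\Big)^{-1} = \big(\sigma^2 \mathrm{Id}_n + \Sigma_n(\Theta)\big)^{-1}.$$

As the matrix $\Sigma_n(\Theta)$ is positive semi-definite with its smallest eigenvalue denoted by $s_n^2(\Theta)$ (see Assumption 3.2), the uniform bound (A.3) becomes

$$\big(\partial_{[\theta_j^*]_{p_1}} \mathbf{T}_{\theta_j^*}\mathbf{f}\big)' \Lambda(\Theta) \big(\partial_{[\theta_j^*]_{p_1}} \mathbf{T}_{\theta_j^*}\mathbf{f}\big) \leq \big(\sigma^2 + s_n^2(\Theta)\big)^{-1} \big\|\partial_{[\theta_j^*]_{p_1}} \mathbf{T}_{\theta_j^*}\mathbf{f}\big\|^2 \leq C(\Theta, f)\, n\, \big(\sigma^2 + s_n^2(\Theta)\big)^{-1},$$

for all $p_1 = 1, \ldots, p$ and $j = 1, \ldots, J$. As above the last inequality is a consequence of Assumption 3.1 and the rest of the proof is identical to the proof of Theorem 3.1. □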



A.3 Proof of Theorem 3.3

For all $\theta \in \mathbb{R}$ the operators $T_\theta f(\cdot) = f(\cdot - \theta)$ are isometric from $L^2([0,1])$ to $L^2([0,1])$, as a change of variable implies immediately that $\|T_\theta f\|_{L^2}^2 = \|f\|_{L^2}^2$. For all continuously differentiable functions $f$, we have $\partial_\theta T_\theta f(t) = -\mathrm{sign}(\theta)\, \partial_t f(t - \theta)$, where $\mathrm{sign}(\cdot)$ is the sign function. Then, for all $\theta \in \Theta$, $\|\partial_\theta T_\theta f\|_{L^2}^2 = \|\partial_t f\|_{L^2}^2 \leq \|\partial_t f\|_\infty^2$ and Assumption 3.1 is satisfied with $C(\Theta, f) = \|\partial_t f\|_\infty^2$. Finally, as the error terms $Z_j$'s are i.i.d. stationary random processes, the covariance function is invariant by the action of the shifts and Assumption 3.2 is satisfied with $\Sigma_n(\Theta) = \Sigma_n$ defined in (5.1) (see Section 5.1 for further details). Then, the result of Theorem 3.3 follows as an application of Theorem 3.2. □
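B Proof of the results in Section 4

B.1 Proof of Proposition 4.1

Remark that $D(\boldsymbol{\theta}) = \sum_{k \in \mathbb{Z}} |c_k^*|^2 \Big(1 - \Big|\frac{1}{J}\sum_{j=1}^{J} e^{i2\pi k(\theta_j - \theta_j^*)}\Big|^2\Big)$, where $c_k^* = \int_0^1 f(t) e^{-i2\pi kt}\, dt$. Thanks to Assumption 4.1, it follows that for any $\boldsymbol{\theta} \in \boldsymbol{\Theta}$,

$$D(\boldsymbol{\theta}) \geq |c_1^*|^2 \Big(1 - \Big|\frac{1}{J}\sum_{j=1}^{J} e^{i2\pi(\theta_j - \theta_j^*)}\Big|^2\Big) \tag{B.1}$$

with $c_1^* \neq 0$. Then, remark that

$$\Big|\frac{1}{J}\sum_{j=1}^{J} e^{i2\pi(\theta_j - \theta_j^*)}\Big|^2 = \frac{1}{J} + \frac{2}{J^2}\sum_{j=1}^{J-1}\sum_{j'=j+1}^{J} \cos\big(2\pi\big((\theta_j - \theta_j^*) - (\theta_{j'} - \theta_{j'}^*)\big)\big).$$

Using a second order Taylor expansion and the mean value theorem, one has that $\cos(2\pi u) \leq 1 - C(\rho)|u|^2$ for any real $u$ such that $|u| \leq 4\rho < 1/4$ with $C(\rho) = 2\pi^2 \cos(8\pi\rho) > 0$. Therefore, the above equality implies that for any $\boldsymbol{\theta} \in \boldsymbol{\Theta}$

$$\Big|\frac{1}{J}\sum_{j=1}^{J} e^{i2\pi(\theta_j - \theta_j^*)}\Big|^2 \leq 1 - \frac{2}{J^2}\sum_{j=1}^{J-1}\sum_{j'=j+1}^{J} C(\rho)\, \big|(\theta_j - \theta_j^*) - (\theta_{j'} - \theta_{j'}^*)\big|^2,$$

since $|(\theta_j - \theta_j^*) - (\theta_{j'} - \theta_{j'}^*)| \leq 4\rho < 1/4$ for all $j, j' = 1, \ldots, J$ by Assumption 4.2 and the hypothesis that $\rho < 1/16$. Hence, using the lower bound (B.1), it follows that for all $\boldsymbol{\theta} \in \boldsymbol{\Theta}$

$$D(\boldsymbol{\theta}) \geq C(f, \rho)\, \frac{1}{J^2}\sum_{j=1}^{J-1}\sum_{j'=j+1}^{J} \big|(\theta_j - \theta_j^*) - (\theta_{j'} - \theta_{j'}^*)\big|^2, \tag{B.2}$$

with $C(f, \rho) = 2|c_1^*|^2 C(\rho)$. Now assume that $\boldsymbol{\theta} \in \boldsymbol{\Theta}_0$. Using the properties that $\sum_{j=1}^{J} \theta_j = 0$ and $\sum_{j=1}^{J} (\theta_j - \theta_j^*) = -\sum_{j=1}^{J} \theta_j^* = -J\bar{\theta}^*$, it follows from elementary algebra that $\frac{1}{J^2}\sum_{j=1}^{J-1}\sum_{j'=j+1}^{J} |(\theta_j - \theta_j^*) - (\theta_{j'} - \theta_{j'}^*)|^2 = \frac{1}{J}\sum_{j=1}^{J} \big(\theta_j - (\theta_j^* - \bar{\theta}^*)\big)^2$. This equality together with the lower bound (B.2) completes the proof. □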
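The elementary inequality $\cos(2\pi u) \leq 1 - C(\rho)|u|^2$ used in the proof above can be checked on a grid. The following Python sketch is a numerical sanity check and not a proof. The value of $\rho$ and the grid size are hypothetical choices for the illustration.

```python
import math

def check_cos_bound(rho, grid=2001):
    # C(rho) = 2 pi^2 cos(8 pi rho), positive as soon as rho < 1/16.
    C = 2 * math.pi ** 2 * math.cos(8 * math.pi * rho)
    worst = float("-inf")
    for i in range(grid):
        u = -4 * rho + 8 * rho * i / (grid - 1)   # u ranges over [-4 rho, 4 rho]
        gap = math.cos(2 * math.pi * u) - (1 - C * u * u)
        worst = max(worst, gap)
    return C, worst

C, worst = check_cos_bound(0.05)   # rho = 0.05 < 1/16
```

The largest value of $\cos(2\pi u) - (1 - C(\rho)u^2)$ over the grid is attained at $u = 0$, where both sides coincide, so `worst` is zero up to rounding.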

C Proof of the results in Section 5

C.1 Proof of Theorem 5.1

Let us state the following lemma, which is a direct consequence of Bernstein's inequality for bounded random variables (see e.g. Proposition 2.9 in [Mas07]):

Lemma C.1. Suppose that Assumption 4.2 holds. Then, for any $x > 0$



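$$\mathbb{P}\Big( \frac{1}{J}\|\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} - \boldsymbol{\theta}^*\|^2 \geq \rho^2 \Big(\sqrt{\frac{2x}{J}} + \frac{x}{3J}\Big)^2 \Big) \leq 2e^{-x}.$$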
Using the inequality $\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*\|^2 \leq \frac{2}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2 + \frac{2}{J}\|\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} - \boldsymbol{\theta}^*\|^2$, it follows that Theorem 5.1 is a consequence of Lemma C.1 and Theorem 7.1. Indeed, it can be easily checked that, under the assumptions of Theorem 5.1, Assumptions 6.1 to 6.4 are satisfied in the case of randomly shifted curves with an equi-spaced design and low-pass Fourier filtering (see the various arguments given in Section 6). The identifiability condition of Assumption 4.4 is given by Proposition 4.1. □



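C.2 Proof of Theorem 5.2

Consider the following inequality $\|\hat{f}^\lambda - f\|_{L^2}^2 \leq 2\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2 + 2\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2}^2$, where $f_{\boldsymbol{\Theta}_0}(t) = f(t - \bar{\theta}^*)$ and $\bar{\theta}^* = \frac{1}{J}\sum_{j=1}^{J} \theta_j^* \in \Theta$. As $f$ is assumed to be in $H_s(A)$, there exists a constant $C(\Theta, f) > 0$ such that $\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2}^2 \leq C(\Theta, f)|\bar{\theta}^*|^2 = C(\Theta, f)\frac{1}{J}\|\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} - \boldsymbol{\theta}^*\|^2$. As explained in part C.1, under the assumptions of Theorem 5.2 the assumptions of Theorem 7.2 are satisfied in the case of randomly shifted curves with an equi-spaced design and low-pass Fourier filtering. The result then follows from Theorem 7.2. □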



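C.3 Proof of Theorem 5.3

Let $n > 1$. We have that

$$\mathbb{E}\big[\|\tilde{f} - f\|_{L^2}\big] = \mathbb{E}\|\tilde{f} - f_{\boldsymbol{\Theta}_0} + f_{\boldsymbol{\Theta}_0} - f\|_{L^2} \geq \underbrace{\mathbb{E}\|\tilde{f} - f_{\boldsymbol{\Theta}_0}\|_{L^2}}_{A} - \underbrace{\mathbb{E}\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2}}_{B}, \tag{C.1}$$

where for all $t \in [0, 1]$, $\tilde{f}(t) = \frac{1}{J}\sum_{j=1}^{J} f(t - \theta_j^* + \hat{\theta}_j^\lambda)$ and $f_{\boldsymbol{\Theta}_0}(t) = f(t - \bar{\theta}^*)$, with $\bar{\theta}^* = \frac{1}{J}\sum_{j=1}^{J} \theta_j^*$. In the rest of the proof, we show that $A$ is bounded from below by a quantity

$$C_0(f, g, n, \sigma^2, \rho) = C(f, \rho)\, \frac{n^{-1}\sigma^2}{\|\partial_t f\|_\infty^2 + n^{-1}\sigma^2 \int_\Theta \big(\partial_\theta \log(g(\theta))\big)^2 g(\theta)\, d\theta}$$

independent of $J$ (this statement is made precise later) and that $B$ goes to zero as $J$ goes to infinity. Then, these two facts imply that there exists a $J_0 \in \mathbb{N}$ such that $J > J_0$ implies that $\mathbb{E}\|\tilde{f} - f\|_{L^2} \geq \frac{1}{2} C_0(f, g, n, \sigma^2, \rho)$, which will yield the desired result.

Lower bound on $A$. Recall that $c_k^* = \int_0^1 f(t) e^{-i2\pi kt}\, dt$, then

$$\|\tilde{f} - f_{\boldsymbol{\Theta}_0}\|_{L^2}^2 = \Big\|\frac{1}{J}\sum_{j=1}^{J} f(\cdot - \theta_j^* + \hat{\theta}_j^\lambda) - f(\cdot - \bar{\theta}^*)\Big\|_{L^2}^2 = \sum_{k \in \mathbb{Z}} |c_k^*|^2 \Big|\frac{1}{J}\sum_{j=1}^{J} \big(e^{i2\pi k(\hat{\theta}_j^\lambda - [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}]_j)} - 1\big)\Big|^2 \geq |c_1^*|^2 \Big|\frac{1}{J}\sum_{j=1}^{J} \big(e^{i2\pi(\hat{\theta}_j^\lambda - [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}]_j)} - 1\big)\Big|^2,$$

where $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0} = (\theta_1^* - \bar{\theta}^*, \ldots, \theta_J^* - \bar{\theta}^*)$, the right hand side of the preceding inequality being positive since $c_1^* \neq 0$ by assumption. Let $u_j = 2\pi(\hat{\theta}_j^\lambda - [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}]_j)$, $j = 1, \ldots, J$. Since $\sum_{j=1}^{J} u_j = 0$ and $|u_j| \leq 4\pi\rho < 3$, $j = 1, \ldots, J$ (by our assumption on $\rho$), Lemma E.1 implies that

$$\|\tilde{f} - f_{\boldsymbol{\Theta}_0}\|_{L^2} \geq C(f, \rho)\, \frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2. \tag{C.2}$$

Now, remark that $\mathbb{E}\big[\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}_0}\|^2\big] \geq \mathbb{E}\big[\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*\|^2\big] - C$ with $C = 2\mathbb{E}\big[|\bar{\theta}^*| \frac{1}{J}\sum_{j=1}^{J} |\hat{\theta}_j^\lambda - \theta_j^*|\big]$. By applying Theorem 3.3 we get that

$$\mathbb{E}\Big[\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*\|^2\Big] \geq C(f, g, n, \sigma^2), \quad \text{with } C(f, g, n, \sigma^2) = \frac{n^{-1}\sigma^2}{\|\partial_t f\|_\infty^2 + n^{-1}\sigma^2 \int_\Theta \big(\partial_\theta \log(g(\theta))\big)^2 g(\theta)\, d\theta}.$$

Then, remark that $C \leq 4\rho \sqrt{\mathbb{E}|\bar{\theta}^*|^2} \leq C(\rho, g) J^{-1/2}$. Hence $C$ tends to 0 as $J$ goes to infinity. Therefore, using equation (C.2), it follows that there exist $C_0(f, g, n, \sigma^2, \rho) > 0$ and $J_1 \in \mathbb{N}$ such that $J > J_1$ implies that

$$A = \mathbb{E}\big[\|\tilde{f} - f_{\boldsymbol{\Theta}_0}\|_{L^2}\big] \geq C_0(f, g, n, \sigma^2, \rho). \tag{C.3}$$

Upper bound on $B$. By assumption, $f$ is continuously differentiable on $[0, 1]$, implying that $\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2} = \|f(\cdot - \bar{\theta}^*) - f(\cdot)\|_{L^2} \leq \|\partial_t f\|_\infty |\bar{\theta}^*|$. Therefore, $\mathbb{E}\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2} \leq \|\partial_t f\|_\infty \sqrt{\mathbb{E}|\bar{\theta}^*|^2} \leq C(f, g) J^{-1/2}$. Hence, there exists a $J_2 \in \mathbb{N}$ such that $J > J_2$ implies

$$B = \mathbb{E}\big[\|f_{\boldsymbol{\Theta}_0} - f\|_{L^2}\big] \leq \frac{1}{2} C_0(f, g, n, \sigma^2, \rho). \tag{C.4}$$

To conclude the proof, equations (C.1), (C.3) and (C.4) imply that there exists a $J_0 \in \mathbb{N}$ such that $J > J_0$ implies $\mathbb{E}\|\tilde{f} - f\|_{L^2} \geq A - B \geq \frac{1}{2} C_0(f, g, n, \sigma^2, \rho)$. □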

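D Proof of the results in Section 7

D.1 Proof of Theorem 7.1

We explain here the main arguments of the proof of Theorem 7.1. Technical Lemmas are given in the second part of the Appendix. Let $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_J) = (\theta_1^1, \ldots, \theta_1^p, \ldots, \theta_J^1, \ldots, \theta_J^p) \in \mathbb{R}^{pJ}$ and decompose the criterion (7.1) as follows,

$$M_\lambda(\boldsymbol{\theta}) = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{Y}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{Y}_{j'} \rangle \Big)^2 dt$$

$$= D(\boldsymbol{\theta}) + \big[R_\lambda(\boldsymbol{\theta}) + Q_\lambda(\boldsymbol{\theta})\big] + \big[Q_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^{Z,\varepsilon}(\boldsymbol{\theta}) + Q_\lambda^\varepsilon(\boldsymbol{\theta}) + R_\lambda^\varepsilon(\boldsymbol{\theta})\big],$$

where $D(\boldsymbol{\theta}) = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \big( \tilde{T}_{\theta_j} T_{\theta_j^*} f(t) - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} T_{\theta_{j'}^*} f(t) \big)^2 dt$, the terms $R_\lambda$ and $Q_\lambda$ are due to the smoothing, namely,

$$Q_\lambda(\boldsymbol{\theta}) = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} B_\lambda(T_{\theta_j^*} f, t) - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} B_\lambda(T_{\theta_{j'}^*} f, t) \Big)^2 dt,$$

$$R_\lambda(\boldsymbol{\theta}) = \frac{2}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} T_{\theta_j^*} f(t) - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} T_{\theta_{j'}^*} f(t) \Big) \Big( \tilde{T}_{\theta_j} B_\lambda(T_{\theta_j^*} f, t) - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} B_\lambda(T_{\theta_{j'}^*} f, t) \Big) dt,$$

and the other terms contain the $Z_j$'s and $\varepsilon_j$'s error terms. Let $\mathbf{T}_{\theta_j^*}\mathbf{Z}_j = (T_{\theta_j^*} Z_j(t_\ell))_{\ell=1}^n$ and $\mathbf{T}_{\theta_j^*}\mathbf{f} = (T_{\theta_j^*} f(t_\ell))_{\ell=1}^n$, then

$$Q_\lambda^Z(\boldsymbol{\theta}) = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{T}_{\theta_{j'}^*}\mathbf{Z}_{j'} \rangle \Big)^2 dt,$$

$$R_\lambda^Z(\boldsymbol{\theta}) = \frac{2}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{T}_{\theta_{j'}^*}\mathbf{f} \rangle \Big) \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{T}_{\theta_{j'}^*}\mathbf{Z}_{j'} \rangle \Big) dt,$$

$$R_\lambda^{Z,\varepsilon}(\boldsymbol{\theta}) = \frac{2\sigma}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{T}_{\theta_{j'}^*}\mathbf{Z}_{j'} \rangle \Big) \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \boldsymbol{\varepsilon}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \boldsymbol{\varepsilon}_{j'} \rangle \Big) dt,$$

$$Q_\lambda^\varepsilon(\boldsymbol{\theta}) = \frac{\sigma^2}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \boldsymbol{\varepsilon}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \boldsymbol{\varepsilon}_{j'} \rangle \Big)^2 dt,$$

$$R_\lambda^\varepsilon(\boldsymbol{\theta}) = \frac{2\sigma}{J}\sum_{j=1}^{J} \int_\Omega \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \mathbf{T}_{\theta_{j'}^*}\mathbf{f} \rangle \Big) \Big( \tilde{T}_{\theta_j} \langle S_\lambda(t), \boldsymbol{\varepsilon}_j \rangle - \frac{1}{J}\sum_{j'=1}^{J} \tilde{T}_{\theta_{j'}} \langle S_\lambda(t), \boldsymbol{\varepsilon}_{j'} \rangle \Big) dt.$$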

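At this stage, recall that $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}} = \operatorname*{argmin}_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} D(\boldsymbol{\theta})$ and $\hat{\boldsymbol{\theta}}^\lambda = \operatorname*{argmin}_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} M_\lambda(\boldsymbol{\theta})$. The proof follows a classical guideline in M-estimation: we show the uniform (over $\boldsymbol{\Theta}$) convergence in probability of the criterion $M_\lambda$ to $D$, yielding the convergence in probability of their argmins $\hat{\boldsymbol{\theta}}^\lambda$ and $\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}$ respectively. Assumption 4.4 ensures that there is a constant $C(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) > 0$ such that

$$\frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}}\|^2 \leq C(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f)\, \big|D(\hat{\boldsymbol{\theta}}^\lambda) - D(\boldsymbol{\theta}^*_{\boldsymbol{\Theta}})\big|. \tag{D.1}$$

Then, a classical inequality in M-estimation and the decomposition of $M_\lambda(\boldsymbol{\theta})$ given above yield

$$D(\hat{\boldsymbol{\theta}}^\lambda) - D(\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}) \leq 2 \sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} |D(\boldsymbol{\theta}) - M_\lambda(\boldsymbol{\theta})| \tag{D.2}$$

$$\leq 2 \underbrace{\sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} |R_\lambda(\boldsymbol{\theta}) + Q_\lambda(\boldsymbol{\theta})|}_{B} + 2 \underbrace{\sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} \big|Q_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^{Z,\varepsilon}(\boldsymbol{\theta}) + Q_\lambda^\varepsilon(\boldsymbol{\theta}) + R_\lambda^\varepsilon(\boldsymbol{\theta})\big|}_{V}.$$

The rest of the proof is devoted to controlling the $B$ and $V$ terms.

Control of $B$. Using Assumptions 6.1 and 6.3 we have that $Q_\lambda(\boldsymbol{\theta}) \leq \frac{C(\Theta)}{J}\sum_{j=1}^{J} \|B_\lambda(T_{\theta_j^*} f, \cdot)\|_{L^2}^2 \leq C(\Theta, \mathcal{F}) B(\lambda)$. Now by applying the Cauchy-Schwarz inequality, $|R_\lambda(\boldsymbol{\theta})| \leq 2 \sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} \big\{\sqrt{D(\boldsymbol{\theta})}\big\} \sqrt{Q_\lambda(\boldsymbol{\theta})}$. By Assumption 6.1 there exists a constant such that $\sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} \{D(\boldsymbol{\theta})\} \leq C(\Theta, \mathcal{F}, f)$ and thus

$$B \leq C(\Theta, \mathcal{F}, f)\Big(B(\lambda) + \sqrt{B(\lambda)}\Big). \tag{D.3}$$

Control of $V$. We give a control in probability of the stochastic quadratic terms $Q_\lambda^Z$ and $Q_\lambda^\varepsilon$. As previously, one can show that there is a constant $C(\Theta, \mathcal{F}, f) > 0$ such that

$$\big|Q_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^Z(\boldsymbol{\theta}) + R_\lambda^{Z,\varepsilon}(\boldsymbol{\theta}) + Q_\lambda^\varepsilon(\boldsymbol{\theta}) + R_\lambda^\varepsilon(\boldsymbol{\theta})\big| \leq C(\Theta, \mathcal{F}, f)\Big(\sqrt{Q_\lambda^Z(\boldsymbol{\theta})} + Q_\lambda^Z(\boldsymbol{\theta}) + Q_\lambda^\varepsilon(\boldsymbol{\theta}) + \sqrt{Q_\lambda^\varepsilon(\boldsymbol{\theta})}\Big),$$

where we have used the inequality $2ab \leq a^2 + b^2$, valid for any $a, b > 0$, to control the term $R_\lambda^{Z,\varepsilon}$. The quadratic terms $Q_\lambda^\varepsilon$ and $Q_\lambda^Z$ are controlled by Corollaries E.1 and E.2 respectively. This immediately yields

$$\mathbb{P}\Big( V \geq C(\Theta, \mathcal{F}, f)\big(\gamma_n(\Theta) + \sigma^2\big)\Big(v(x, J, \lambda) + \sqrt{v(x, J, \lambda)}\Big) \Big) \leq 2e^{-x}, \tag{D.4}$$

where $v(x, J, \lambda) = V(\lambda)\Big(1 + 4\frac{x}{J} + \sqrt{4\frac{x}{J}}\Big)$.

Putting together equations (D.1), (D.2), (D.3) and (D.4), we have

$$\mathbb{P}\Big( \frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}}\|^2 \geq C(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) \Big[ \big(\gamma_n(\Theta) + \sigma^2\big)\Big(\sqrt{v(x, J, \lambda)} + v(x, J, \lambda)\Big) + \Big(B(\lambda) + \sqrt{B(\lambda)}\Big) \Big] \Big) \leq 2e^{-x},$$

which completes the proof of Theorem 7.1. □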



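D.2 Proof of Theorem 7.2

In this part, we use the notations introduced in the proof of Theorem 7.1. We have,

$$\|\hat{f}^\lambda - f_{\boldsymbol{\Theta}}\|_{L^2}^2 \leq 2 \underbrace{\Big\|\frac{1}{J}\sum_{j=1}^{J} \Big( \tilde{T}_{[\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j} T_{\theta_j^*} f - \tilde{T}_{[\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle \Big)\Big\|_{L^2}^2}_{B'} + 2 \underbrace{\Big\|\frac{1}{J}\sum_{j=1}^{J} \Big( \tilde{T}_{[\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - \tilde{T}_{\hat{\theta}_j^\lambda} \langle S_\lambda(\cdot), \mathbf{Y}_j \rangle \Big)\Big\|_{L^2}^2}_{V'}.$$

Again, the first term above depends on the bias, and the second term (stochastic) can be controlled in probability. Under Assumptions 6.1 and 6.3 we have that

$$B' \leq \frac{C(\Theta)}{J}\sum_{j=1}^{J} \big\|\langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - T_{\theta_j^*} f\big\|_{L^2}^2 \leq C(\Theta, \mathcal{F}) B(\lambda),$$

and

$$V' \leq \frac{1}{J}\sum_{j=1}^{J} \Big\| \tilde{T}_{[\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - \tilde{T}_{\hat{\theta}_j^\lambda} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle + \tilde{T}_{\hat{\theta}_j^\lambda} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle - \tilde{T}_{\hat{\theta}_j^\lambda} \langle S_\lambda(\cdot), \mathbf{Y}_j \rangle \Big\|_{L^2}^2$$

$$\leq \frac{C(\Theta, \mathcal{F})}{J}\sum_{j=1}^{J} \Big( \big\|\hat{\theta}_j^\lambda - [\boldsymbol{\theta}^*_{\boldsymbol{\Theta}}]_j\big\|^2 + \big\|\langle S_\lambda(\cdot), \mathbf{Y}_j - \mathbf{T}_{\theta_j^*}\mathbf{f} \rangle\big\|_{L^2}^2 \Big)$$

$$\leq C(\Theta, \mathcal{F}) \Big( \frac{1}{J}\|\hat{\boldsymbol{\theta}}^\lambda - \boldsymbol{\theta}^*_{\boldsymbol{\Theta}}\|^2 + \frac{1}{J}\sum_{j=1}^{J} \big\|\langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j + \sigma\boldsymbol{\varepsilon}_j \rangle\big\|_{L^2}^2 \Big).$$

The stochastic term $\frac{1}{J}\sum_{j=1}^{J} \|\langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j + \sigma\boldsymbol{\varepsilon}_j \rangle\|_{L^2}^2$ in the above inequality can be controlled using Lemma E.2 and the arguments in the proof of Corollaries E.1 and E.2 to obtain that for any $x > 0$

$$\mathbb{P}\Big( \frac{1}{J}\sum_{j=1}^{J} \big\|\langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j + \sigma\boldsymbol{\varepsilon}_j \rangle\big\|_{L^2}^2 \geq C(\Theta, \mathcal{F}, f)\big(\gamma_n(\Theta) + \sigma^2\big)\Big(\sqrt{v(x, J, \lambda)} + v(x, J, \lambda)\Big) \Big) \leq e^{-x}.$$

Then, from Theorem 7.1 it follows that

$$\mathbb{P}\Big( B' + V' \geq C(\Theta, \boldsymbol{\Theta}, \mathcal{F}, f) \Big[ \big(\gamma_n(\Theta) + \sigma^2\big)\Big(v(x, J, \lambda) + \sqrt{v(x, J, \lambda)}\Big) + B(\lambda) + \sqrt{B(\lambda)} \Big] \Big) \leq 2e^{-x},$$

which completes the proof. □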



E Technical Lemmas 

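Lemma E.1. Let $u = (u_1, \ldots, u_J)$ be such that $\sum_{j=1}^{J} u_j = 0$ with $|u_j| \leq \delta$ for some $0 < \delta < 3$ and for all $j = 1, \ldots, J$. Then, there exists a constant $C(\delta) > 0$ such that $\Big|\frac{1}{J}\sum_{j=1}^{J} (e^{iu_j} - 1)\Big| \geq \frac{C(\delta)}{J}\|u\|^2$, where $\|u\|^2 = u_1^2 + \ldots + u_J^2$.

Proof. Let $F(u_1, \ldots, u_J) = \frac{1}{J}\sum_{j=1}^{J} e^{iu_j}$. A Taylor expansion implies that there exist $\xi_j \in [-\delta, \delta]$, $j = 1, \ldots, J$, such that

$$F(u_1, \ldots, u_J) = 1 + \frac{i}{J}\sum_{j=1}^{J} u_j - \frac{1}{2J}\sum_{j=1}^{J} u_j^2 - \frac{i}{6J}\sum_{j=1}^{J} u_j^3 e^{i\xi_j}$$

holds for all $|u_j| \leq \delta$. Now, since $\sum_{j=1}^{J} u_j = 0$, it follows that

$$\Big|\frac{1}{J}\sum_{j=1}^{J} e^{iu_j} - 1\Big| \geq \frac{1}{2J}\sum_{j=1}^{J} u_j^2 - \frac{1}{6J}\Big|\sum_{j=1}^{J} u_j^3 e^{i\xi_j}\Big|.$$

Since $|u_j| \leq \delta$, we have that $\frac{1}{6J}\Big|\sum_{j=1}^{J} u_j^3 e^{i\xi_j}\Big| \leq \frac{\delta}{6J}\sum_{j=1}^{J} |u_j|^2$, which finally implies that $\Big|\frac{1}{J}\sum_{j=1}^{J} e^{iu_j} - 1\Big| \geq \frac{3 - \delta}{6J}\sum_{j=1}^{J} u_j^2$, which proves the result by letting $C(\delta) = \frac{3 - \delta}{6} > 0$ since $\delta < 3$. □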
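The lower bound of Lemma E.1 can be checked numerically on random centered vectors. The Python sketch below takes $C(\delta) = (3 - \delta)/6$, the value that follows from the Taylor argument of the proof. The value of $\delta$, the seed and the sampling scheme are hypothetical choices for this illustration.

```python
import cmath
import random

def lemma_e1_sides(u, delta):
    # Left- and right-hand sides of |(1/J) sum (exp(i u_j) - 1)| >= C(delta)/J * ||u||^2.
    J = len(u)
    lhs = abs(sum(cmath.exp(1j * x) - 1 for x in u) / J)
    rhs = (3 - delta) / (6 * J) * sum(x * x for x in u)
    return lhs, rhs

rng = random.Random(0)
delta = 2.5
checks = []
for _ in range(200):
    J = rng.randint(2, 30)
    u = [rng.uniform(-delta / 2, delta / 2) for _ in range(J)]
    mean = sum(u) / J
    u = [x - mean for x in u]   # enforce sum(u) = 0; |u_j| <= delta still holds
    checks.append(lemma_e1_sides(u, delta))
```

Every pair in `checks` satisfies `lhs >= rhs`, in agreement with the lemma. The centering step matters: without $\sum_j u_j = 0$ the first-order term of the expansion does not vanish and the argument of the proof no longer applies.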



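Lemma E.2. Let $\xi_{\lambda,J}(A_1, \ldots, A_J) = \frac{1}{J}\sum_{j=1}^{J} \|\langle S_\lambda(\cdot), A_j \boldsymbol{\varepsilon}_j \rangle\|_{L^2}^2$, where $\boldsymbol{\varepsilon}_j \sim \mathcal{N}(0, I_n)$ and the $A_j$'s are nonrandom non-negative $n \times n$ symmetric matrices. Then, for all $x > 0$ and all $n > 1$,

$$\mathbb{P}\Big( \xi_{\lambda,J}(A_1, \ldots, A_J) \geq \frac{1}{J}\|\mathbf{A}\|\Big(1 + 4\frac{x}{J} + \sqrt{4\frac{x}{J}}\Big) \Big) \leq e^{-x},$$

where $\|\mathbf{A}\| = \sum_{j=1}^{J}\sum_{\ell=1}^{n} r_{j,\ell}$ with $r_{j,\ell}$ being the $\ell$-th eigenvalue of the matrix $\mathbf{A}_j = A_j \big[\langle S_\lambda^\ell, S_\lambda^{\ell'} \rangle_{L^2}\big]_{\ell,\ell'=1}^{n} A_j$.

Proof. Some parts of the proof follow the arguments in [BM98] (Lemma 8, part 7.6). We have

$$\xi_{\lambda,J} = \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \Big(\sum_{\ell=1}^{n} S_\lambda^\ell(t) [A_j \boldsymbol{\varepsilon}_j]_\ell\Big)^2 dt = \frac{1}{J}\sum_{j=1}^{J} \boldsymbol{\varepsilon}_j' \mathbf{A}_j \boldsymbol{\varepsilon}_j,$$

where $\mathbf{A}_j = A_j \mathbf{S}_\lambda A_j$ with $\mathbf{S}_\lambda = \big[\langle S_\lambda^\ell, S_\lambda^{\ell'} \rangle_{L^2}\big]_{\ell,\ell'=1}^{n}$. Now, denote by $r_{j,1} \geq \ldots \geq r_{j,n}$ the eigenvalues of $\mathbf{A}_j$ with $r_{j,1} \geq \ldots \geq r_{j,n} \geq 0$ and $r_1 = \max_j \{r_{j,1}\}$. We can write $\mathbf{A}_j = (\mathbf{S}_\lambda^{1/2} A_j)'(\mathbf{S}_\lambda^{1/2} A_j)$, so that $\mathbf{A}_j$ is positive semi-definite. Then, let $\tilde{\xi}_{\lambda,J} = J\xi_{\lambda,J} - J\,\mathbb{E}\xi_{\lambda,J} = \sum_{j=1}^{J} (\boldsymbol{\varepsilon}_j' \mathbf{A}_j \boldsymbol{\varepsilon}_j - \operatorname{tr} \mathbf{A}_j)$. Let $a > 0$; by Markov's inequality it follows that for all $u \in \big(0, \frac{1}{2r_1}\big)$,

$$\mathbb{P}(\tilde{\xi}_{\lambda,J} > a) = \mathbb{P}\big(e^{u\tilde{\xi}_{\lambda,J}} > e^{ua}\big) \leq e^{-ua} \prod_{j=1}^{J} \mathbb{E}\big[e^{u \boldsymbol{\varepsilon}_j' \mathbf{A}_j \boldsymbol{\varepsilon}_j - u \operatorname{tr} \mathbf{A}_j}\big],$$

since the $\boldsymbol{\varepsilon}_j$'s are independent. The log-Laplace transform of $\varphi_{\lambda,j} = \boldsymbol{\varepsilon}_j' \mathbf{A}_j \boldsymbol{\varepsilon}_j - \operatorname{tr} \mathbf{A}_j$ is $\log\big(\mathbb{E}\big[e^{u\varphi_{\lambda,j}}\big]\big) = \sum_{\ell=1}^{n} \big(-u r_{j,\ell} - \frac{1}{2}\log(1 - 2u r_{j,\ell})\big)$. We now use the inequality $-x - \frac{1}{2}\log(1 - 2x) \leq \frac{x^2}{1 - 2x}$ for all $0 < x < \frac{1}{2}$, which applies since $u \in \big(0, \frac{1}{2r_1}\big)$. This implies that $\log\big(\mathbb{E}\big[e^{u\varphi_{\lambda,j}}\big]\big) \leq \sum_{\ell=1}^{n} \frac{u^2 r_{j,\ell}^2}{1 - 2u r_{j,\ell}} \leq \frac{u^2 \|r_j\|^2}{1 - 2u r_1}$, where $\|r_j\|^2 = r_{j,1}^2 + \ldots + r_{j,n}^2$. Finally, we have

$$\mathbb{P}(\tilde{\xi}_{\lambda,J} > a) \leq \exp\Big(-\Big(ua - \sum_{j=1}^{J} \frac{u^2 \|r_j\|^2}{1 - 2u r_1}\Big)\Big) = \exp\Big(-\Big(ua - \frac{\|r\|^2 u^2}{1 - 2r_1 u}\Big)\Big), \tag{E.1}$$

where $\|r\|^2 = \sum_{j=1}^{J}\sum_{\ell=1}^{n} r_{j,\ell}^2$. The right hand side of the above inequality achieves its minimum at $u = \frac{1}{2r_1}\Big(1 - \sqrt{\frac{\|r\|^2}{2a r_1 + \|r\|^2}}\Big)$. Evaluating (E.1) at this point and using the inequality $(1 + x)^{1/2} \leq 1 + \frac{x}{2}$, valid for all $x \geq -1$, one has that

$$\mathbb{P}(\tilde{\xi}_{\lambda,J} > a) \leq \exp\Big(-\frac{a^2}{2r_1 a + 2\|r\|^2 + 2\|r\|^2 \big(1 + 4a r_1/(2\|r\|^2)\big)^{1/2}}\Big) \leq \exp\Big(-\frac{a^2}{4r_1 a + 4\|r\|^2}\Big).$$

By setting $x = \frac{a^2}{4r_1 a + 4\|r\|^2}$, we derive the following concentration inequality for $\xi_{\lambda,J} = \frac{1}{J}\tilde{\xi}_{\lambda,J} + \frac{1}{J}\sum_{j=1}^{J} \operatorname{tr}(\mathbf{A}_j)$:

$$\mathbb{P}\Big( \xi_{\lambda,J} \geq \frac{1}{J}\sum_{j=1}^{J}\sum_{\ell=1}^{n} r_{j,\ell} + \frac{4 r_1 x}{J} + \frac{2\|r\|\sqrt{x}}{J} \Big) \leq e^{-x}.$$

To finish the proof, remark that $\|r\|^2 = \sum_{j=1}^{J}\sum_{\ell=1}^{n} r_{j,\ell}^2 \leq \Big(\sum_{j=1}^{J}\sum_{\ell=1}^{n} r_{j,\ell}\Big)^2$ since all the $r_{j,\ell}$'s are non-negative. □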
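The elementary inequality $-x - \frac{1}{2}\log(1 - 2x) \leq \frac{x^2}{1 - 2x}$ on $(0, \frac{1}{2})$, which drives the bound on the log-Laplace transform in the proof above, can be checked on a grid. The Python sketch below is a numerical sanity check only. The function name and the grid are hypothetical choices for the illustration.

```python
import math

def log_laplace_gap(x):
    # Right-hand side minus left-hand side of
    #   -x - 0.5 * log(1 - 2x) <= x^2 / (1 - 2x),   for 0 < x < 1/2.
    return x * x / (1 - 2 * x) - (-x - 0.5 * math.log(1 - 2 * x))

# Grid over (0, 1/2): x = 0.001, 0.002, ..., 0.499.
gaps = [log_laplace_gap(i / 1000) for i in range(1, 500)]
```

All gaps are non-negative. This agrees with a term-by-term comparison of the two power series in $2x$, since $\frac{(2x)^k}{2k} \leq \frac{(2x)^k}{4}$ for every $k \geq 2$.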

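Corollary E.1. Under Assumptions 6.1 to 6.3 there exists a constant $C(\Theta, \mathcal{F}) > 0$ such that for all $x > 0$,

$$\mathbb{P}\Big( \sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} Q_\lambda^\varepsilon(\boldsymbol{\theta}) \geq C(\Theta, \mathcal{F})\, \sigma^2 V(\lambda)\Big(1 + 4\frac{x}{J} + \sqrt{4\frac{x}{J}}\Big) \Big) \leq e^{-x}.$$

Proof. Assumption 6.1 gives the uniform bound

$$Q_\lambda^\varepsilon(\boldsymbol{\theta}) \leq \frac{1}{J}\sum_{j=1}^{J} \int_\Omega \big(\tilde{T}_{\theta_j} \langle S_\lambda(t), \sigma\boldsymbol{\varepsilon}_j \rangle\big)^2 dt \leq \frac{C(\Theta)}{J}\sum_{j=1}^{J} \|\langle S_\lambda(\cdot), \sigma\boldsymbol{\varepsilon}_j \rangle\|_{L^2}^2 = C(\Theta, \mathcal{F})\, \xi_{\lambda,J}(\sigma \mathrm{Id}_n, \ldots, \sigma \mathrm{Id}_n),$$

where $\xi_{\lambda,J}(\sigma \mathrm{Id}_n, \ldots, \sigma \mathrm{Id}_n)$ is defined in Lemma E.2 and does not depend on $\boldsymbol{\theta}$. Thus, the result immediately follows from Lemma E.2. □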
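Corollary E.2. Under Assumptions 6.1 to 6.4 there exists a constant $C(\Theta, \mathcal{F}) > 0$ such that for all $x > 0$,

$$\mathbb{P}\Big( \sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} Q_\lambda^Z(\boldsymbol{\theta}) \geq C(\Theta, \mathcal{F})\, \gamma_n(\Theta) V(\lambda)\Big(1 + 4\frac{x}{J} + \sqrt{4\frac{x}{J}}\Big) \Big) \leq e^{-x}.$$

Proof. Assumption 6.1 gives the uniform bound

$$Q_\lambda^Z(\boldsymbol{\theta}) \leq \frac{1}{J}\sum_{j=1}^{J} \big\|\tilde{T}_{\theta_j} \langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j \rangle\big\|_{L^2}^2 \leq \frac{C(\Theta)}{J}\sum_{j=1}^{J} \big\|\langle S_\lambda(\cdot), \mathbf{T}_{\theta_j^*}\mathbf{Z}_j \rangle\big\|_{L^2}^2.$$

Hence, conditionally on $\boldsymbol{\theta}^*$ we have that $\sup_{\boldsymbol{\theta} \in \boldsymbol{\Theta}} Q_\lambda^Z(\boldsymbol{\theta}) \leq C(\Theta, \mathcal{F})\, \xi_{\lambda,J}(A_1, \ldots, A_J)$, where $\xi_{\lambda,J}(A_1, \ldots, A_J)$ is defined in Lemma E.2 with $A_j = \mathbb{E}_{\boldsymbol{\theta}^*}\big[\mathbf{T}_{\theta_j^*}\mathbf{Z}_j (\mathbf{T}_{\theta_j^*}\mathbf{Z}_j)'\big]^{1/2}$. Let us now give an upper bound on the largest eigenvalues of the matrices $\mathbf{A}_j = A_j \mathbf{S}_\lambda A_j$ with $\mathbf{S}_\lambda = \big[\langle S_\lambda^\ell, S_\lambda^{\ell'} \rangle_{L^2}\big]_{\ell,\ell'=1}^{n}$. Under Assumption 6.4 we have that $\operatorname{tr}(\mathbf{A}_j) \leq \gamma_{\max}(A_j^2) \operatorname{tr} \mathbf{S}_\lambda \leq \gamma_n(\Theta) V(\lambda)$, for all $j = 1, \ldots, J$ and any $\boldsymbol{\theta}^* \in \Theta^J$. Thus, the result follows by arguing as in the proof of Lemma E.2 and by taking expectation with respect to $\boldsymbol{\theta}^*$. □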